PMCC PMCC

Search tips
Search criteria

Advanced
Results 1-15 (15)
 

Clipboard (0)
None
Journals
Authors
more »
Year of Publication
2.  A partition-based approach to identify gene-environment interactions in genome wide association studies 
BMC Proceedings  2014;8(Suppl 1):S60.
It is believed that almost all common diseases are the consequence of complex interactions between genetic markers and environmental factors. However, few such interactions have been documented to date. Conventional statistical methods for detecting gene and environmental interactions are often based on the linear regression model, which assumes a linear interaction effect. In this study, we propose a nonparametric partition-based approach that is able to capture complex interaction patterns. We apply this method to the real data set of hypertension provided by Genetic Analysis Workshop 18. Compared with the linear regression model, the proposed approach is able to identify many additional variants with significant gene-environmental interaction effects. We further investigate one single-nucleotide polymorphism identified by our method and show that its gene-environmental interaction effect is, indeed, nonlinear. To adjust for the family dependence of phenotypes, we apply different permutation strategies and investigate their effects on the outcomes.
doi:10.1186/1753-6561-8-S1-S60
PMCID: PMC4143762  PMID: 25519395
3.  Considering interactive effects in the identification of influential regions with extremely rare variants via fixed bin approach 
BMC Proceedings  2014;8(Suppl 1):S7.
In this study, we analyze the Genetic Analysis Workshop 18 (GAW18) data to identify regions of single-nucleotide polymorphisms (SNPs), which significantly influence hypertension status among individuals. We have studied the marginal impact of these regions on disease status in the past, but we extend the method to deal with environmental factors present in data collected over several exam periods. We consider the respective interactions between such traits as smoking status and age with the genetic information and hope to augment those genetic regions deemed influential marginally with those that contribute via an interactive effect. In particular, we focus only on rare variants and apply a procedure to combine signal among rare variants in a number of "fixed bins" along the chromosome. We extend the procedure in Agne et al [1] to incorporate environmental factors by dichotomizing subjects via traits such as smoking status and age, running the marginal procedure among each respective category (i.e., smokers or nonsmokers), and then combining their scores into a score for interaction. To avoid overlap of subjects, we examine each exam period individually. Out of a possible 629 fixed-bin regions in chromosome 3, we observe that 11 show up in multiple exam periods for gene-smoking score. Fifteen regions exhibit significance for multiple exam periods for gene-age score, with 4 regions deemed significant for all 3 exam periods. The procedure pinpoints SNPs in 8 "answer" genes, with 5 of these showing up as significant in multiple testing schemes (Gene-Smoking, Gene-Age for Exams 1, 2, and 3).
doi:10.1186/1753-6561-8-S1-S7
PMCID: PMC4143804  PMID: 25519400
4.  Discovering pure gene-environment interactions in blood pressure genome-wide association studies data: a two-step approach incorporating new statistics 
BMC Proceedings  2014;8(Suppl 1):S62.
Environment has long been known to play an important part in disease etiology. However, not many genome-wide association studies take environmental factors into consideration. There is also a need for new methods to identify the gene-environment interactions. In this study, we propose a 2-step approach incorporating an influence measure that capturespure gene-environment effect. We found that pure gene-age interaction has a stronger association than considering the genetic effect alone for systolic blood pressure, measured by counting the number of single-nucleotide polymorphisms (SNPs)reaching a certain significance level. We analyzed the subjects by dividing them into two age groups and found no overlap in the top identified SNPs between them. This suggested that age might have a nonlinear effect on genetic association. Furthermore, the scores of the top SNPs for the two age subgroups were about 3times those obtained when using all subjects for systolic blood pressure. In addition, the scores of the older age subgroup were much higher than those for the younger group. The results suggest that genetic effects are stronger in older age and that genetic association studies should take environmental effects into consideration, especially age.
doi:10.1186/1753-6561-8-S1-S62
PMCID: PMC4143689  PMID: 25519396
5.  A dual-clustering framework for association screening with whole genome sequencing data and longitudinal traits 
BMC Proceedings  2014;8(Suppl 1):S47.
Current sequencing technology enables generation of whole genome sequencing data sets that contain a high density of rare variants, each of which is carried by, at most, 5% of the sampled subjects. Such variants are involved in the etiology of most common diseases in humans. These diseases can be studied by relevant longitudinal phenotype traits. Tests for association between such genotype information and longitudinal traits allow the study of the function of rare variants in complex human disorders. In this paper, we propose an association-screening framework that highlights the genotypic differences observed on rare variants and the longitudinal nature of phenotypes. In particular, both variants within a gene and longitudinal phenotypes are used to create partitions of subjects. Association between the 2 sets of constructed partitions is then evaluated. We apply the proposed strategy to the simulated data from the Genetic Analysis Workshop 18 and compare the obtained results with those from sequence kernel association test using the receiver operating characteristic curves.
doi:10.1186/1753-6561-8-S1-S47
PMCID: PMC4143709  PMID: 25519328
6.  Association screening for genes with multiple potentially rare variants: an inverse-probability weighted clustering approach 
BMC Proceedings  2011;5(Suppl 9):S106.
Both common variants and rare variants are involved in the etiology of most complex diseases in humans. Developments in sequencing technology have led to the identification of a high density of rare variant single-nucleotide polymorphisms (SNPs) on the genome, each of which affects only at most 1% of the population. Genotypes derived from these SNPs allow one to study the involvement of rare variants in common human disorders. Here, we propose an association screening approach that treats genes as units of analysis. SNPs within a gene are used to create partitions of individuals, and inverse-probability weighting is used to overweight genotypic differences observed on rare variants. Association between a phenotype trait and the constructed partition is then evaluated. We consider three association tests (one-way ANOVA, chi-square test, and the partition retention method) and compare these strategies using the simulated data from the Genetic Analysis Workshop 17. Several genes that contain causal SNPs were identified by the proposed method as top genes.
doi:10.1186/1753-6561-5-S9-S106
PMCID: PMC3287829  PMID: 22373536
7.  Identifying rare disease variants in the Genetic Analysis Workshop 17 simulated data: a comparison of several statistical approaches 
BMC Proceedings  2011;5(Suppl 9):S17.
Genome-wide association studies have been successful at identifying common disease variants associated with complex diseases, but the common variants identified have small effect sizes and account for only a small fraction of the estimated heritability for common diseases. Theoretical and empirical studies suggest that rare variants, which are much less frequent in populations and are poorly captured by single-nucleotide polymorphism chips, could play a significant role in complex diseases. Several new statistical methods have been developed for the analysis of rare variants, for example, the combined multivariate and collapsing method, the weighted-sum method and a replication-based method. Here, we apply and compare these methods to the simulated data sets of Genetic Analysis Workshop 17 and thereby explore the contribution of rare variants to disease risk. In addition, we investigate the usefulness of extreme phenotypes in identifying rare risk variants when dealing with quantitative traits. Finally, we perform a pathway analysis and show the importance of the vascular endothelial growth factor pathway in explaining different phenotypes.
doi:10.1186/1753-6561-5-S9-S17
PMCID: PMC3287851  PMID: 22373071
8.  Identifying influential regions in extremely rare variants using a fixed-bin approach 
BMC Proceedings  2011;5(Suppl 9):S3.
In this study, we analyze the Genetic Analysis Workshop 17 data to identify regions of single-nucleotide polymorphisms (SNPs) that exhibit a significant influence on response rate (proportion of subjects with an affirmative affected status), called the affected ratio, among rare variants. Under the null hypothesis, the distribution of rare variants is assumed to be uniform over case (affected) and control (unaffected) subjects. We attempt to pinpoint regions where the composition is significantly different between case and control events, specifically where there are unusually high numbers of rare variants among affected subjects. We focus on private variants, which require a degree of “collapsing” to combine information over several SNPs, to obtain meaningful results. Instead of implementing a gene-based approach, where regions would vary in size and sometimes be too small to achieve a strong enough signal, we implement a fixed-bin approach, with a preset number of SNPs per region, relying on the assumption that proximity and similarity go hand in hand. Through application of 100-SNP and 30-SNP fixed bins, we identify several most influential regions, which later are seen to contain some of the causal SNPs. The 100- and 30-SNP approaches detected seven and three causal SNPs among the most significant regions, respectively, with two overlapping SNPs located in the ELAVL4 gene, reported by both procedures.
doi:10.1186/1753-6561-5-S9-S3
PMCID: PMC3287865  PMID: 22373412
9.  New insights into old methods for identifying causal rare variants 
BMC Proceedings  2011;5(Suppl 9):S50.
The advance of high-throughput next-generation sequencing technology makes possible the analysis of rare variants. However, the investigation of rare variants in unrelated-individuals data sets faces the challenge of low power, and most methods circumvent the difficulty by using various collapsing procedures based on genes, pathways, or gene clusters. We suggest a new way to identify causal rare variants using the F-statistic and sliced inverse regression. The procedure is tested on the data set provided by the Genetic Analysis Workshop 17 (GAW17). After preliminary data reduction, we ranked markers according to their F-statistic values. Top-ranked markers were then subjected to sliced inverse regression, and those with higher absolute coefficients in the most significant sliced inverse regression direction were selected. The procedure yields good false discovery rates for the GAW17 data and thus is a promising method for future study on rare variants.
doi:10.1186/1753-6561-5-S9-S50
PMCID: PMC3287888  PMID: 22373518
10.  Rheumatoid arthritis-associated gene-gene interaction network for rheumatoid arthritis candidate genes 
BMC Proceedings  2009;3(Suppl 7):S75.
Rheumatoid arthritis (RA, MIM 180300) is a chronic and complex autoimmune disease. Using the North American Rheumatoid Arthritis Consortium (NARAC) data set provided in Genetic Analysis Workshop 16 (GAW16), we used the genotype-trait distortion (GTD) scores and proposed analysis procedures to capture the gene-gene interaction effects of multiple susceptibility gene regions on RA. In this paper, we focused on 27 RA candidate gene regions (531 SNPs) based on a literature search. Statistical significance was evaluated using 1000 permutations. HLADRB1 was found to have strong marginal association with RA. We identified 14 significant interactions (p < 0.01), which were aggregated into an association network among 12 selected candidate genes PADI4, FCGR3, TNFRSF1B, ITGAV, BTLA, SLC22A4, IL3, VEGF, TNF, NFKBIL1, TRAF1-C5, and MIF. Based on our and other contributors' findings during the GAW16 conference, we further studied 24 candidate regions with 336 SNPs. We found 23 significant interactions (p-value < 0.01), nine interactions in addition to our initial findings, and the association network was extended to include candidate genes HLA-A, HLA-B, HLA-C, CTLA4, and IL6. As we will discuss in this paper, the reported possible interactions between genes may suggest potential biological activities of RA.
PMCID: PMC2795977  PMID: 20018070
11.  Genome-wide gene-based analysis of rheumatoid arthritis-associated interaction with PTPN22 and HLA-DRB1 
BMC Proceedings  2009;3(Suppl 7):S132.
The genes PTPN22 and HLA-DRB1 have been found by a number of studies to confer an increased risk for rheumatoid arthritis (RA), which indicates that both genes play an important role in RA etiology. It is believed that they not only have strong association with RA individually, but also interact with other related genes that have not been found to have predisposing RA mutations. In this paper, we conduct genome-wide searches for RA-associated gene-gene interactions that involve PTPN22 or HLA-DRB1 using the Genetic Analysis Workshop 16 Problem 1 data from the North American Rheumatoid Arthritis Consortium. MGC13017, HSPCAL3, MIA, PTPNS1L, and IGLVI-70, which showed association with RA in previous studies, have been confirmed in our analysis.
PMCID: PMC2795906  PMID: 20017999
12.  Pattern-based mining strategy to detect multi-locus association and gene × environment interaction 
BMC Proceedings  2007;1(Suppl 1):S16.
As genome-wide association studies grow in popularity for the identification of genetic factors for common and rare diseases, analytical methods to comb through large numbers of genetic variants efficiently to identify disease association are increasingly in demand. We have developed a pattern-based data-mining approach to discover unlinked multilocus genetic effects for complex disease and to detect genotype × phenotype/genotype × environment interactions. On a densely mapped chromosome 18 data set for rheumatoid arthritis that was made available by Genetic Analysis Workshop 15, this method detected two potential two-locus associations as well as a putative two-locus gene × gender interaction.
PMCID: PMC2367515  PMID: 18466505
13.  Transcription activity hot spot, is it real or an artifact? 
BMC Proceedings  2007;1(Suppl 1):S94.
Transcription activity 'hot spots', defined as chromosome regions that contain more expression quantitative trait loci than would have been expected by chance, have been frequently detected both in humans and in model organisms. It has been common to consider the existence of hot spots as evidence for master regulation of gene expression. However, hot spots could also simply be due to highly correlated gene expressions or linkage disequilibrium and do not truly represent master regulators. A recent simulation study using real human gene expression data but simulated random single-nucleotide polymorphism genotypes showed patterns of clustering of expression quantitative trait loci that resemble those in actual studies [Perez-Enciso: Genetics 2004, 166: 547–554.]. In this study, to assess the credibility of transcription activity hot spots, we conducted genetic analyses on gene expressions provided by Genetic Analysis Workshop 15 Problem 1.
PMCID: PMC2367508  PMID: 18466598
14.  Joint study of genetic regulators for expression traits related to breast cancer 
BMC Proceedings  2007;1(Suppl 1):S10.
Background
The mRNA expression levels of genes have been shown to have discriminating power for the classification of breast cancer. Studying the heritability of gene expression levels on breast cancer related transcripts can lead to the identification of shared common regulators and inter-regulation patterns, which would be important for dissecting the etiology of breast cancer.
Results
We applied multilocus association genome-wide scans to 18 breast cancer related transcripts and combined the results with traditional linkage scans. Regulatory hotspots for these transcripts were identified and some inter-regulation patterns were observed. We also derived evidence on interacting genetic regulatory loci shared by a number of these transcripts.
Conclusion
In this paper, by restricting to a set of related genes, we were able to employ a more detailed multilocus approach that evaluates both marginal and interaction association signals at each single-nucleotide polymorphism. Interesting inter-regulation patterns and significant overlaps of genetic regulators between transcripts were observed. Interaction association results returned more expression quantitative trait locus hotspots that are significant.
PMCID: PMC2367474  PMID: 18466439
15.  Constructing gene association networks for rheumatoid arthritis using the backward genotype-trait association (BGTA) algorithm 
BMC Proceedings  2007;1(Suppl 1):S13.
Background
Rheumatoid arthritis (RA, MIM 180300) is a common and complex inflammatory disorder. The North American Rheumatoid Arthritis Consortium (NARAC) data, as part of the Genetic Analysis Workshop 15 data, consists of both genome scan and candidate gene studies on RA patients.
Results
We applied the backward genotype-trait association (BGTA) algorithm to capture marginal and gene × gene interaction effects of multiple susceptibility loci on RA disease status. A two-stage screening approach was used for the genome scan, whereas a comprehensive study of all possible subsets was conducted for the candidate genes. For the genome scan, we constructed an association network among 39 genetic loci that demonstrated strong signals, 19 of which have been reported in the RA literature. For the candidate genes, we found strong signals for PTPN22 and SUMO4. Based on significant association evidence, we built an association network among the loci of PTPN22, PADI4, DLG5, SLC22A4, SUMO4, and CARD15. To control for false positives, we used permutation tests to constrain the family-wise type I error rate to 1%.
Conclusion
Using the BGTA algorithm, we identified genetic loci and candidate genes that were associated with RA susceptibility and association networks among them. For the first time, we report possible interactions between single-nucleotide polymorphisms/genes, which may be useful for biological interpretation.
PMCID: PMC2367461  PMID: 18466472

Results 1-15 (15)