Results 1-3 (3)
 

1.  Best Practices for Scientific Computing 
PLoS Biology  2014;12(1):e1001745.
We describe a set of best practices for scientific software development, based on research and experience, that will improve scientists' productivity and the reliability of their software.
doi:10.1371/journal.pbio.1001745
PMCID: PMC3886731  PMID: 24415924
2.  Bootstrap Aggregating of Alternating Decision Trees to Detect Sets of SNPs that Associate with Disease 
Genetic epidemiology  2012;36(2):99-106.
Complex genetic disorders are a result of a combination of genetic and non-genetic factors, all potentially interacting. Machine learning methods hold the potential to identify multi-locus and environmental associations thought to drive complex genetic traits. Decision trees, a popular machine learning technique, offer a computationally low complexity algorithm capable of detecting associated sets of SNPs of arbitrary size, including modern genome-wide SNP scans. However, interpretation of the importance of an individual SNP within these trees can present challenges.
We present a new decision tree algorithm denoted as Bagged Alternating Decision Trees (BADTrees) that is based on identifying common structural elements in a bootstrapped set of ADTrees. The algorithm is of order nk², where n is the number of SNPs considered and k is the number of SNPs in the tree constructed. Our simulation study suggests that BADTrees have higher power and lower type I error rates than ADTrees alone and comparable power with lower type I error rates compared to logistic regression. We illustrate the application of this method using simulated data as well as data from the Lupus Large Association Study 1 (7822 SNPs in 3548 individuals). Our results suggest that BADTrees hold promise as a low computational order algorithm for detecting complex combinations of SNP and environmental factors associated with disease.
doi:10.1002/gepi.21608
PMCID: PMC3769952  PMID: 22851473
Machine Learning; Genetic Association; Gene-Gene Interaction; Multi-locus Models
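The bootstrap-aggregating idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' BADTrees implementation: a one-SNP decision stump stands in for an ADTree, "common structural elements" is reduced to counting how often each SNP is selected across bootstrap resamples, and the function names, toy data, and 0/1/2 genotype coding are all assumptions made for illustration.

```python
import random
from collections import Counter


def best_stump(genotypes, labels):
    """Return the index of the SNP whose single split misclassifies the fewest samples.

    Stand-in for fitting one tree; genotypes are assumed coded as 0/1/2.
    """
    n_samples = len(labels)
    best_snp, best_errors = 0, n_samples + 1
    for j in range(len(genotypes[0])):
        for threshold in (0, 1):
            # Rule: predict "case" when genotype > threshold.
            errors = sum((row[j] > threshold) != bool(y)
                         for row, y in zip(genotypes, labels))
            # The flipped rule makes the complementary number of errors.
            errors = min(errors, n_samples - errors)
            if errors < best_errors:
                best_snp, best_errors = j, errors
    return best_snp


def bagged_snp_counts(genotypes, labels, n_bootstrap=50, seed=0):
    """Count how often each SNP is chosen across bootstrap resamples."""
    rng = random.Random(seed)
    n_samples = len(labels)
    counts = Counter()
    for _ in range(n_bootstrap):
        # Resample individuals with replacement.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        sample_x = [genotypes[i] for i in idx]
        sample_y = [labels[i] for i in idx]
        counts[best_stump(sample_x, sample_y)] += 1
    return counts


# Hypothetical toy data: SNP 2 tracks case status exactly, SNPs 0 and 1 do not.
toy_labels = [i % 2 for i in range(40)]
toy_genotypes = [[(i // 2) % 3, (i // 4) % 3, 2 * (i % 2)] for i in range(40)]

if __name__ == "__main__":
    print(bagged_snp_counts(toy_genotypes, toy_labels, n_bootstrap=20, seed=1))
```

SNPs that are selected in most resamples are the stable candidates; a SNP that appears in only a few resamples is more likely to reflect sampling noise, which is the intuition behind aggregating over bootstrapped trees.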
3.  Genetic Analyses of Interferon Pathway-Related Genes Reveals Multiple New Loci Associated with Systemic Lupus Erythematosus (SLE) 
Arthritis and rheumatism  2011;63(7):2049-2057.
Objective
The overexpression of interferon (IFN)-inducible genes is a prominent feature of SLE, serves as a marker for active and more severe disease, and is also observed in other autoimmune and inflammatory conditions. The genetic variations responsible for sustained activation of IFN responsive genes are unknown.
Methods
We systematically evaluated association of SLE with a total of 1,754 IFN-pathway related genes, including IFN-inducible genes known to be differentially expressed in SLE patients and their direct regulators. We performed a three-stage design where two cohorts (total n=939 SLE cases, 3,398 controls) were analyzed independently and jointly for association with SLE, and the results were adjusted for the number of comparisons.
Results
A total of 16,137 SNPs passed all quality control filters of which 316 demonstrated replicated association with SLE in both cohorts. Nine variants were further genotyped for confirmation in an average of 1,316 independent SLE cases and 3,215 independent controls. Association with SLE was confirmed for several genes, including the transmembrane receptor CD44 (rs507230, P = 3.98×10⁻¹²), cytokine pleiotrophin (PTN) (rs919581, P = 5.38×10⁻⁴), the heat-shock DNAJA1 (rs10971259, P = 6.31×10⁻³), and the nuclear import protein karyopherin alpha 1 (KPNA1) (rs6810306, P = 4.91×10⁻²).
Conclusion
This study expands the number of candidate genes associated with SLE and highlights the potential of pathway-based approaches for gene discovery. Identification of the causal alleles will help elucidate the molecular mechanisms responsible for activation of the IFN system in SLE.
doi:10.1002/art.30356
PMCID: PMC3128183  PMID: 21437871
