Results 1-2 (2)

1.  Massive parallelization of serial inference algorithms for a complex generalized linear model 
Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety.
PMCID: PMC4201181  PMID: 25328363
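To illustrate the cyclic coordinate descent idea named in the abstract above, here is a minimal serial sketch in Python. It is not the authors' implementation: it swaps in ordinary logistic regression with an independent normal prior on each coefficient in place of the paper's conditioned generalized linear model, and it runs on the CPU with NumPy only. The function name `ccd_logistic` and the parameters `prior_var` and `n_sweeps` are hypothetical, chosen for illustration.

```python
import numpy as np

def ccd_logistic(X, y, prior_var=10.0, n_sweeps=50):
    """Cyclic coordinate descent for MAP logistic regression (illustrative only).

    Assumes a zero-mean normal prior with variance `prior_var` on each
    coefficient. Each coordinate gets a one-dimensional Newton step.
    """
    n, p = X.shape
    beta = np.zeros(p)
    eta = np.zeros(n)  # linear predictor X @ beta, updated incrementally
    for _ in range(n_sweeps):
        for j in range(p):
            mu = 1.0 / (1.0 + np.exp(-eta))       # fitted probabilities
            xj = X[:, j]
            # First and second derivatives of the negative log posterior
            # with respect to beta[j] alone.
            grad = xj @ (mu - y) + beta[j] / prior_var
            hess = (xj ** 2) @ (mu * (1.0 - mu)) + 1.0 / prior_var
            step = float(np.clip(-grad / hess, -1.0, 1.0))  # crude step cap
            beta[j] += step
            eta += step * xj                       # keep eta consistent
    return beta

# Small synthetic check with assumed true coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
true_beta = np.array([1.5, -2.0, 0.0])
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)
beta_hat = ccd_logistic(X, y)
print(beta_hat)
```

The per-coordinate gradient and curvature are sums over observations, which is the kind of reduction that lends itself to parallel hardware; how the paper actually maps its computation onto graphics processing units is not described in the abstract.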
2.  Statistical Methods for Comparative Phenomics Using High-Throughput Phenotype Microarrays
We propose statistical methods for comparing phenomics data generated by the Biolog Phenotype Microarray (PM) platform for high-throughput phenotyping. Instead of the routinely used visual inspection of data with no sound inferential basis, we develop two approaches. The first approach is based on quantifying the distance between mean or median curves from two treatments and then applying a permutation test; we also consider a permutation test applied to areas under mean curves. The second approach employs functional principal component analysis. Properties of the proposed methods are investigated on both simulated data and data sets from the PM platform.
PMCID: PMC2942029  PMID: 20865133
Keywords: functional data analysis; principal components; permutation tests; phenotype microarrays; high-throughput phenotyping; phenomics; Biolog
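The first approach in the abstract above (a distance between mean curves followed by a permutation test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: curves are assumed to be sampled on a common time grid and stored as rows of a NumPy array, the distance is taken to be a root-mean-square difference between pointwise mean curves, and the names `mean_curve_distance` and `permutation_test` are hypothetical. The data are synthetic.

```python
import numpy as np

def mean_curve_distance(a, b):
    """Root-mean-square difference between the pointwise mean curves."""
    return float(np.sqrt(np.mean((a.mean(axis=0) - b.mean(axis=0)) ** 2)))

def permutation_test(a, b, n_perm=999, seed=0):
    """Permutation p-value for the mean-curve distance between two groups."""
    rng = np.random.default_rng(seed)
    observed = mean_curve_distance(a, b)
    pooled = np.vstack([a, b])
    n_a = a.shape[0]
    count = 0
    for _ in range(n_perm):
        # Shuffle group labels by permuting the rows of the pooled curves.
        idx = rng.permutation(pooled.shape[0])
        d = mean_curve_distance(pooled[idx[:n_a]], pooled[idx[n_a:]])
        if d >= observed:
            count += 1
    # Add-one correction so the p-value is never exactly zero.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Synthetic example: 8 curves per treatment on a 20-point grid,
# with the second treatment shifted upward by 1.0.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 20)
group_a = t + rng.normal(0.0, 0.1, size=(8, 20))
group_b = t + 1.0 + rng.normal(0.0, 0.1, size=(8, 20))
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

Replacing `mean(axis=0)` with `np.median(..., axis=0)` would give the median-curve variant mentioned in the abstract; the area-under-curve and functional principal component approaches are not covered by this sketch.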
