PMCC PMCC

Search tips
Search criteria

Advanced
Results 1-4 (4)
 

Clipboard (0)
None
Journals
Authors
Year of Publication
Document Types
1.  A fused lasso latent feature model for analyzing multi-sample aCGH data 
Biostatistics (Oxford, England)  2011;12(4):776-791.
Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Some simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying 3 distinct groups of samples based on their patterns of CNV for chromosome 17.
doi:10.1093/biostatistics/kxr012
PMCID: PMC3169672  PMID: 21642389
Cancer; DNA copy number; False discovery rate; Mutation
2.  Sparse inverse covariance estimation with the graphical lasso 
Biostatistics (Oxford, England)  2007;9(3):432-441.
SUMMARY
We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm—the graphical lasso—that is remarkably fast: It solves a 1000-node problem (~500 000 parameters) in at most a minute and is 30–4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and Bühlmann (2006). We illustrate the method on some cell-signaling data from proteomics.
doi:10.1093/biostatistics/kxm045
PMCID: PMC3019769  PMID: 18079126
Gaussian covariance; Graphical model; L1; Lasso
3.  A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis 
Biostatistics (Oxford, England)  2009;10(3):515-534.
We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix X as , where dk, uk, and vk minimize the squared Frobenius norm of X, subject to penalties on uk and vk. This results in a regularized version of the singular value decomposition. Of particular interest is the use of L1-penalties on uk and vk, which yields a decomposition of X using sparse vectors. We show that when the PMD is applied using an L1-penalty on vk but not on uk, a method for sparse principal components results. In fact, this yields an efficient algorithm for the “SCoTLASS” proposal (Jolliffe and others 2003) for obtaining sparse principal components. This method is demonstrated on a publicly available gene expression data set. We also establish connections between the SCoTLASS method for sparse principal component analysis and the method of Zou and others (2006). In addition, we show that when the PMD is applied to a cross-products matrix, it results in a method for penalized canonical correlation analysis (CCA). We apply this penalized CCA method to simulated data and to a genomic data set consisting of gene expression and DNA copy number measurements on the same set of samples.
doi:10.1093/biostatistics/kxp008
PMCID: PMC2697346  PMID: 19377034
Canonical correlation analysis; DNA copy number; Integrative genomic analysis; L1; Matrix decomposition; Principal component analysis; Sparse principal component analysis; SVD
4.  Sparse inverse covariance estimation with the graphical lasso 
Biostatistics (Oxford, England)  2007;9(3):432-441.
We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm—the graphical lasso—that is remarkably fast: It solves a 1000-node problem (∼500000 parameters) in at most a minute and is 30–4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and Bühlmann (2006). We illustrate the method on some cell-signaling data from proteomics.
doi:10.1093/biostatistics/kxm045
PMCID: PMC3019769  PMID: 18079126
Gaussian covariance; Graphical model; L1; Lasso

Results 1-4 (4)