
Results 1-2 (2)

1.  A penalized EM algorithm incorporating missing data mechanism for Gaussian parameter estimation 
Biometrics  2014;70(2):312-322.
Missing data rates could depend on the targeted values in many settings, including mass spectrometry-based proteomic profiling studies. Here we consider mean and covariance estimation under a multivariate Gaussian distribution with non-ignorable missingness, including scenarios in which the dimension (p) of the response vector is equal to or greater than the number (n) of independent observations. A parameter estimation procedure is developed by maximizing a class of penalized likelihood functions that entails explicit modeling of missing data probabilities. The performance of the resulting ‘penalized EM algorithm incorporating missing data mechanism (PEMM)’ estimation procedure is evaluated in simulation studies and in a proteomic data illustration.
PMCID: PMC4061266  PMID: 24471933
Expectation-maximization (EM) algorithm; maximum penalized likelihood estimate; not-missing-at-random (NMAR)
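
Illustrative note (not part of the article record): the abstract's starting point, that missing data rates can depend on the targeted values, can be made concrete with a small simulation. The sketch below is not the PEMM procedure. It only shows why ignoring a value-dependent missingness mechanism biases a naive mean estimate. The logistic form of the missingness probability, the function name `simulate_nmar`, and the parameters `a` and `b` are all assumptions chosen for illustration.

```python
import math
import random


def simulate_nmar(n=20000, mu=0.0, sigma=1.0, a=0.0, b=2.0, seed=0):
    """Draw Gaussian values, then delete each one with a value-dependent probability.

    Assumed (hypothetical) mechanism: P(missing | x) = 1 / (1 + exp(a + b * x)),
    so with b > 0 low values are more likely to be missing.
    """
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        full.append(x)
        p_missing = 1.0 / (1.0 + math.exp(a + b * x))
        if rng.random() >= p_missing:  # the value survives and is observed
            observed.append(x)
    return full, observed


full, observed = simulate_nmar()
full_mean = sum(full) / len(full)
observed_mean = sum(observed) / len(observed)

print("fraction observed:", len(observed) / len(full))
print("mean of complete data:", round(full_mean, 3))
print("naive mean of observed data:", round(observed_mean, 3))
```

Because low values are preferentially deleted under this assumed mechanism, the naive mean of the observed values sits above the mean of the complete data. Closing that gap is what motivates modeling the missing data probabilities explicitly.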
2.  Learning oncogenic pathways from binary genomic instability data 
Biometrics  2011;67(1):164-173.
Genomic instability, the propensity of aberrations in chromosomes, plays a critical role in the development of many diseases. High throughput genotyping experiments have been performed to study genomic instability in diseases. The output of such experiments can be summarized as high dimensional binary vectors, where each binary variable records aberration status at one marker locus. It is of keen interest to understand how aberrations may interact with each other, as it provides insight into the process of the disease development. In this paper, we propose a novel method, LogitNet, to infer such interactions among these aberration events. The method is based on penalized logistic regression with an extension to account for spatial correlation in the genomic instability data. We conduct extensive simulation studies and show that the proposed method performs well in the situations considered. Finally, we illustrate the method using genomic instability data from breast cancer samples.
PMCID: PMC3020238  PMID: 20377578
Conditional Dependence; Graphical Model; Lasso; Loss-of-Heterozygosity; Regularized Logistic Regression
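
Illustrative note (not part of the article record): the general idea of inferring interactions among binary variables with penalized logistic regression can be sketched as follows. This is not the LogitNet implementation. It is a simplified neighborhood-selection sketch that fits one L1-penalized logistic regression per variable by proximal gradient descent and keeps an edge only when both directions select it. It omits the spatial-correlation extension described in the abstract. The simulated data, the penalty value `lam=0.1`, and all function names are assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def l1_logistic(X, y, lam, iters=200):
    """Minimize mean logistic loss + lam * ||beta||_1; the intercept is unpenalized."""
    n, p = len(X), len(X[0])
    step = 4.0 / (p + 1)  # 1/L bound, valid for 0/1 features plus an intercept
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, xi))) - yi
            g0 += r
            for j in range(p):
                g[j] += r * xi[j]
        b0 -= step * g0 / n
        beta = [soft_threshold(bj - step * gj / n, step * lam)
                for bj, gj in zip(beta, g)]
    return b0, beta


def neighborhood_edges(data, lam):
    """Regress each variable on the others; keep edges selected in both directions."""
    p = len(data[0])
    selected = set()
    for j in range(p):
        y = [row[j] for row in data]
        X = [row[:j] + row[j + 1:] for row in data]
        _, beta = l1_logistic(X, y, lam)
        others = [k for k in range(p) if k != j]
        for k, b in zip(others, beta):
            if b != 0.0:
                selected.add((j, k))
    return {(j, k) for (j, k) in selected if j < k and (k, j) in selected}


def simulate(n=500, seed=1):
    """Five binary markers; marker 1 copies marker 0 with probability 0.9."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x0 = int(rng.random() < 0.5)
        x1 = x0 if rng.random() < 0.9 else 1 - x0
        rest = [int(rng.random() < 0.5) for _ in range(3)]
        data.append([x0, x1] + rest)
    return data


edges = neighborhood_edges(simulate(), lam=0.1)
print("selected edges:", sorted(edges))
```

In this toy setup only markers 0 and 1 are dependent by construction, so a well-chosen penalty should keep the edge between them and shrink the coefficients of the independent markers to exactly zero. The penalty would normally be tuned, for example by cross-validation, rather than fixed in advance.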
