PMC search results for author:("CHU, haitian")

Results 1-25 (36)
1.  A prognostic signature of G₂ checkpoint function in melanoma cell lines 
Cell Cycle  2013;12(7):1071-1082.
As DNA damage checkpoints are barriers to carcinogenesis, G2 checkpoint function was quantified to test for override of this checkpoint during melanomagenesis. Primary melanocytes displayed an effective G2 checkpoint response to ionizing radiation (IR)-induced DNA damage. Thirty-seven percent of melanoma cell lines displayed a significant defect in G2 checkpoint function. Checkpoint function was melanoma subtype-specific: “epithelial-like” melanoma lines with wild-type NRAS and BRAF displayed an effective checkpoint, while lines with mutant NRAS and BRAF displayed defective checkpoint function. Expression of oncogenic B-Raf in a checkpoint-effective melanoma line attenuated G2 checkpoint function significantly but modestly; other alterations must therefore be needed to produce the severe attenuation of G2 checkpoint function seen in some BRAF-mutant melanoma lines. Quantitative trait analysis tools identified mRNA species whose expression was correlated with G2 checkpoint function in the melanoma lines. A 165-gene signature was identified with a high correlation with checkpoint function (p < 0.004) and a low false discovery rate (≤ 0.077). The G2 checkpoint gene signature predicted G2 checkpoint function with 77–94% accuracy. The signature was enriched in lysosomal genes and contained numerous genes associated with regulation of chromatin structure and cell cycle progression. The core machinery of the cell cycle was not altered in checkpoint-defective lines; rather, numerous mediators of core machinery function were. When applied to an independent series of primary melanomas, the predictive G2 checkpoint signature was prognostic of distant metastasis-free survival. These results emphasize the value of expression profiling of primary melanomas for understanding melanoma biology and disease prognosis.
doi:10.4161/cc.24067
PMCID: PMC3646863  PMID: 23454897
G2 checkpoint; melanoma; microarray; ionizing radiation; oncogene
2.  The Bayesian Covariance Lasso 
Statistics and its interface  2013;6(2):243-259.
Estimation of sparse covariance matrices and their inverses subject to positive definiteness constraints has drawn a lot of attention in recent years. The abundance of high-dimensional data, where the sample size (n) is less than the dimension (d), requires shrinkage estimation methods, since the maximum likelihood estimator is not positive definite in this case. Furthermore, when n is larger than d but not sufficiently so, shrinkage estimation is more stable than maximum likelihood, as it reduces the condition number of the precision matrix. Frequentist methods have utilized penalized likelihood methods, whereas Bayesian approaches rely on matrix decompositions or Wishart priors for shrinkage. In this paper we propose a new method, called the Bayesian Covariance Lasso (BCLASSO), for the shrinkage estimation of a precision (covariance) matrix. We consider a class of priors for the precision matrix that leads to the popular frequentist penalties as special cases, develop a Bayes estimator for the precision matrix, and propose an efficient sampling scheme that does not precalculate boundaries for positive definiteness. The proposed method is permutation invariant and performs shrinkage and estimation simultaneously for non-full-rank data. Simulations show that the proposed BCLASSO performs similarly to frequentist methods for non-full-rank data.
PMCID: PMC3925647  PMID: 24551316
Bayesian covariance lasso; non-full rank data; Network exploration; Penalized likelihood; Precision matrix
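A minimal sketch of how a prior of this kind can recover a frequentist penalty as a special case, assuming (the abstract does not specify the exact form) a Laplace prior on the entries of the precision matrix Ω constrained to be positive definite:

```latex
\pi(\Omega \mid \lambda) \;\propto\; \exp\{-\lambda \lVert \Omega \rVert_{1}\}\,\mathbf{1}(\Omega \succ 0),
\qquad
\hat{\Omega}_{\mathrm{MAP}} \;=\; \arg\max_{\Omega \succ 0}\;
\tfrac{n}{2}\bigl\{\log\det\Omega - \operatorname{tr}(S\,\Omega)\bigr\} - \lambda \lVert \Omega \rVert_{1},
```

where S is the sample covariance matrix. Up to a rescaling of λ (and details such as whether the diagonal is penalized), the posterior mode coincides with the frequentist graphical lasso estimate, which is the sense in which the prior class nests the popular penalties.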
3.  Physical activity and maternal-fetal circulation measured by Doppler ultrasound 
Objective
To examine the association of physical activity with maternal-fetal circulation as measured by uterine and umbilical artery Doppler flow velocimetry waveforms.
Study Design
Participants included 781 pregnant women who underwent Doppler ultrasound of the uterine and umbilical arteries and who self-reported past-week physical activity. Linear and generalized estimating equation regression models were used to examine these associations.
Results
Moderate-to-vigorous total and recreational activity were associated with higher uterine artery pulsatility index (PI) and an increased risk of uterine artery notching as compared to reporting no total or recreational physical activity, respectively. Moderate-to-vigorous work activity was associated with lower uterine artery PI and a reduced risk of uterine artery notching as compared to no work activity. No associations were identified with the umbilical circulation measured by the resistance index.
Conclusion
In this epidemiologic study, recreational and work activity were associated with opposite effects on uterine artery PI and uterine artery notching, though associations were modest in magnitude.
doi:10.1038/jp.2012.68
PMCID: PMC3459289  PMID: 22678142
work; recreational activity; maternal-fetal blood flow; pregnancy; Doppler flow velocimetry waveforms; preeclampsia
4.  Comparison of Viral Env Proteins from Acute and Chronic Infections with Subtype C Human Immunodeficiency Virus Type 1 Identifies Differences in Glycosylation and CCR5 Utilization and Suggests a New Strategy for Immunogen Design 
Journal of Virology  2013;87(13):7218-7233.
Understanding human immunodeficiency virus type 1 (HIV-1) transmission is central to developing effective prevention strategies, including a vaccine. We compared phenotypic and genetic variation in HIV-1 env genes from subjects in acute/early infection and subjects with chronic infections in the context of subtype C heterosexual transmission. We found that the transmitted viruses all used CCR5 and required high levels of CD4 to infect target cells, suggesting selection for replication in T cells and not macrophages after transmission. In addition, the transmitted viruses were more likely to use a maraviroc-sensitive conformation of CCR5, perhaps identifying a feature of the target T cell. We confirmed an earlier observation that the transmitted viruses were, on average, modestly underglycosylated relative to the viruses from chronically infected subjects. This difference was most pronounced in comparing the viruses in acutely infected men to those in chronically infected women. These features of the transmitted virus point to selective pressures during the transmission event. We did not observe a consistent difference either in heterologous neutralization sensitivity or in sensitivity to soluble CD4 between the two groups, suggesting similar conformations between viruses from acute and chronic infection. However, the presence or absence of glycosylation sites had differential effects on neutralization sensitivity for different antibodies. We suggest that the occasional absence of glycosylation sites encoded in the conserved regions of env, further reduced in transmitted viruses, could expose specific surface structures on the protein as antibody targets.
doi:10.1128/JVI.03577-12
PMCID: PMC3700278  PMID: 23616655
5.  Bivariate Random Effects Models for Meta-Analysis of Comparative Studies with Binary Outcomes: Methods for the Absolute Risk Difference and Relative Risk 
Multivariate meta-analysis is increasingly utilized in biomedical research to combine data from multiple comparative clinical studies for evaluating drug efficacy and safety profiles. When the probability of the event of interest is rare, or when the individual study sample sizes are small, a substantial proportion of studies may not have any event of interest. Conventional meta-analysis methods either exclude such studies or include them through an ad hoc continuity correction, adding an arbitrary positive value to each cell of the corresponding 2×2 tables, which may result in less accurate conclusions. Furthermore, different continuity corrections may result in inconsistent conclusions. In this article, we discuss a bivariate Beta-binomial model derived from the Sarmanov family of bivariate distributions and a bivariate generalized linear mixed effects model for binary clustered data to make valid inferences. These bivariate random effects models use all available data without ad hoc continuity corrections and naturally account for the potential correlation between treatment (or exposure) and control groups within studies. We then utilize the bivariate random effects models to reanalyze two recent meta-analysis data sets.
doi:10.1177/0962280210393712
PMCID: PMC3348438  PMID: 21177306
clustered binary data; bivariate random effects models; Beta-binomial distribution; meta-analysis; bivariate generalized linear mixed models
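For reference, the general Sarmanov construction behind the bivariate Beta-binomial model takes the form below; the specific mixing functions ψ₁, ψ₂ used in the paper are not given in the abstract:

```latex
h(x, y) \;=\; f_{1}(x)\, f_{2}(y)\,\bigl\{ 1 + \omega\, \psi_{1}(x)\, \psi_{2}(y) \bigr\},
\qquad
\mathbb{E}[\psi_{1}(X)] = \mathbb{E}[\psi_{2}(Y)] = 0,
```

with ω restricted so that h ≥ 0 everywhere. Taking f₁ and f₂ to be Beta densities yields a correlated bivariate Beta distribution for the treatment- and control-group risks, with ω governing the within-study correlation.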
6.  Missing Data in Clinical Studies: Issues and Methods 
Journal of Clinical Oncology  2012;30(26):3297-3303.
Missing data are a prevailing problem in any type of data analysis. A variable is considered missing for a participant if its value (outcome or covariate) is not observed for that participant. In this article, various issues in analyzing studies with missing data are discussed. Particularly, we focus on missing response and/or covariate data for studies with discrete, continuous, or time-to-event end points in which generalized linear models, models for longitudinal data such as generalized linear mixed effects models, or Cox regression models are used. We discuss various classifications of missing data that may arise in a study and demonstrate in several situations that the commonly used method of throwing out all participants with any missing data may lead to incorrect results and conclusions. The methods described are applied to data from an Eastern Cooperative Oncology Group phase II clinical trial of liver cancer and a phase III clinical trial of advanced non–small-cell lung cancer. Although the main area of application discussed here is cancer, the issues and methods we discuss apply to any type of study.
doi:10.1200/JCO.2011.38.7589
PMCID: PMC3948388  PMID: 22649133
7.  The Effect of HAART on HIV RNA Trajectory Among Treatment Naïve Men and Women: a Segmental Bernoulli/Lognormal Random Effects Model with Left Censoring 
Epidemiology (Cambridge, Mass.)  2010;21(Suppl 4):S25-S34.
Background
Highly active antiretroviral therapy (HAART) rapidly suppresses human immunodeficiency virus (HIV) viral replication and reduces circulating viral load, but the long-term effects of HAART on viral load remain unclear.
Methods
We evaluated HIV viral load trajectories over 8 years following HAART initiation in the Multicenter AIDS Cohort Study and the Women’s Interagency HIV Study. The study included 157 HIV-infected men and 199 HIV-infected women who were antiretroviral naïve and contributed 1311 and 1837 semiannual person-visits post-HAART, respectively. To account for within-subject correlation and the high proportion of left-censored viral loads, we used a segmental Bernoulli/lognormal random effects model.
Results
Approximately 3 months (0.30 years for men and 0.22 years for women) after HAART initiation, HIV viral loads were optimally suppressed (ie, with very low HIV RNA) for 44% (95% confidence interval = 39%–49%) of men and 43% (38%–47%) of women, whereas the other 56% of men and 57% of women had on average 2.1 (1.5–2.6) and 3.0 (2.7–3.2) log10 copies/mL, respectively.
Conclusion
After 8 years on HAART, 75% of men and 80% of women had optimal suppression, whereas the rest of the men and women had suboptimal suppression with a median HIV RNA of 3.1 and 3.7 log10 copies/mL, respectively.
doi:10.1097/EDE.0b013e3181ce9950
PMCID: PMC3736572  PMID: 20386106
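A schematic of a Bernoulli/lognormal two-part likelihood of the kind described in the Methods, written here under simplifying assumptions (the paper's segmental specification additionally allows the mean trajectory to change slope at fixed times after HAART initiation):

```latex
\Pr(Y_{ij} \le d) = \pi_{ij}, \quad \operatorname{logit}(\pi_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\alpha} + a_{i};
\qquad
\log_{10} Y_{ij} \mid Y_{ij} > d \;=\; \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_{i} + \varepsilon_{ij},
\quad \varepsilon_{ij} \sim N(0, \sigma^{2}),
```

where Y_ij is the viral load of subject i at visit j, d is the assay detection limit (accounting for the left censoring), and (a_i, b_i) are correlated subject-level random effects that induce within-subject correlation across visits.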
8.  A prognostic signature of defective p53-dependent G1 checkpoint function in melanoma cell lines 
Pigment cell & melanoma research  2012;25(4):514-526.
Summary
Melanoma cell lines and normal human melanocytes were assayed for p53-dependent G1 checkpoint response to ionizing radiation-induced DNA damage. Sixty-six percent of melanoma cell lines displayed a defective G1 checkpoint. Checkpoint function was correlated with sensitivity to ionizing radiation, with checkpoint-defective lines being radio-resistant. Microarray analysis identified 316 probes whose expression was correlated with G1 checkpoint function in the melanoma lines (P ≤ 0.007), including the p53 transactivation targets CDKN1A, DDB2 and RRM2B. The 316-probe list predicted G1 checkpoint function of the melanoma lines with 86% accuracy using a binary analysis and 91% accuracy using a continuous analysis. When applied to microarray data from primary melanomas, the 316-probe list was prognostic of four-year distant metastasis-free survival. Thus, p53 function, radio-sensitivity and metastatic spread may be estimated in melanomas from a signature of gene expression.
doi:10.1111/j.1755-148X.2012.01010.x
PMCID: PMC3397470  PMID: 22540896
gene; expression; signature; p53; function; checkpoint; melanoma
9.  Bayesian Analysis on Meta-analysis of Case-control Studies Accounting for Within-study Correlation 
Statistical methods in medical research  2011;10.1177/0962280211430889.
In retrospective studies, the odds ratio is often used as the measure of association. Under the assumption of independent beta priors, the exact posterior distribution of the odds ratio given a single 2 × 2 table has been derived in the literature. However, independence between risks within the same study may be an oversimplified assumption, because cases and controls in the same study are likely to share common factors and thus to be correlated. Furthermore, in a meta-analysis of case-control studies, investigators usually have multiple 2 × 2 tables. In this paper, we first extend the published results on a single 2 × 2 table to allow within-study prior correlation while retaining the advantage of a closed-form posterior formula, and then extend the results to multiple 2 × 2 tables and the regression setting. The hyperparameters, including the within-study correlation, are estimated via an empirical Bayes approach. The overall odds ratio and the exact posterior distribution of the study-specific odds ratio are inferred based on the estimated hyperparameters. We conduct simulation studies to verify our exact posterior distribution formulas and investigate the finite-sample properties of the inference for the overall odds ratio. The results are illustrated through a twin study of genetic heritability and a meta-analysis of the association between N-acetyltransferase 2 (NAT2) acetylation status and colorectal cancer.
doi:10.1177/0962280211430889
PMCID: PMC3683108  PMID: 22143403
Bivariate beta-binomial model; Exact method; Hypergeometric function; Meta-analysis; Odds ratio; Sarmanov family
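The independent-beta special case (zero within-study correlation) is easy to check by Monte Carlo; the sketch below uses a hypothetical 2×2 table, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2x2 table: x1 exposed among n1 cases, x0 among n0 controls.
x1, n1, x0, n0 = 12, 40, 5, 40
a, b = 1.0, 1.0  # independent Beta(1, 1) priors on the two risks

# Conjugacy: each risk has a Beta posterior; the odds ratio posterior
# is obtained by transforming joint draws.
p1 = rng.beta(a + x1, b + n1 - x1, size=100_000)
p0 = rng.beta(a + x0, b + n0 - x0, size=100_000)
or_draws = (p1 / (1 - p1)) / (p0 / (1 - p0))

print(np.median(or_draws), np.percentile(or_draws, [2.5, 97.5]))
```

The paper's contribution replaces the independent priors with a correlated (Sarmanov-type) prior while retaining a closed-form posterior; the simulation above only covers the zero-correlation corner of that family.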
10.  Bayesian Posterior Distributions Without Markov Chains 
American Journal of Epidemiology  2012;175(5):368-375.
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976–1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984–1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
doi:10.1093/aje/kwr433
PMCID: PMC3282880  PMID: 22306565
Bayes theorem; epidemiologic methods; inference; Monte Carlo method; posterior distribution; simulation
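A toy, one-parameter version of the rejection sampler described above, assuming a uniform prior and a binomial likelihood (the paper's examples use logistic models; all numbers here are hypothetical): draws from the prior are accepted with probability proportional to the likelihood, so the accepted draws follow the posterior.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

y, n = 36, 234  # hypothetical: 36 events in 234 trials
log_m = stats.binom.logpmf(y, n, y / n)  # likelihood peak (at the MLE)

draws = []
while len(draws) < 10_000:
    p = rng.uniform(0.0, 1.0)  # candidate from the prior
    # Accept with probability L(p) / max_p L(p).
    if np.log(rng.uniform()) < stats.binom.logpmf(y, n, p) - log_m:
        draws.append(p)

draws = np.array(draws)
print(draws.mean(), np.percentile(draws, [2.5, 97.5]))
```

The transparency comes from the acceptance rule being a one-line restatement of Bayes' theorem; the price, as the authors note, is efficiency when the model has many parameters.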
11.  Performance of rapid influenza H1N1 diagnostic tests: a meta-analysis 
Background
Following the outbreaks of 2009 pandemic H1N1 infection, rapid influenza diagnostic tests have been used to detect H1N1 infection. However, at the time this manuscript was drafted, no meta-analysis had been undertaken to assess their diagnostic accuracy.
Methods
The literature was systematically searched to identify studies that reported the performance of rapid tests. Random effects meta-analyses were conducted to summarize the overall performance.
Results
Seventeen studies were selected with 1879 cases and 3477 non-cases. The overall sensitivity and specificity estimates of the rapid tests were 0.51 (95%CI: 0.41, 0.60) and 0.98 (95%CI: 0.94, 0.99). Studies reported heterogeneous sensitivity estimates, ranging from 0.11 to 0.88. If the prevalence was 30%, the overall positive and negative predictive values were 0.94 (95%CI: 0.85, 0.98) and 0.82 (95%CI: 0.79, 0.85). The overall specificities from different manufacturers were comparable, while there were some differences for the overall sensitivity estimates. BinaxNOW had a lower overall sensitivity of 0.39 (95%CI: 0.24, 0.57) compared to all the others (p-value < 0.001), whereas QuickVue had a higher overall sensitivity of 0.57 (95%CI: 0.50, 0.63) compared to all the others (p-value = 0.005).
Conclusions
Rapid tests have high specificity but low sensitivity and thus limited usefulness.
doi:10.1111/j.1750-2659.2011.00284.x
PMCID: PMC3288365  PMID: 21883964
meta analysis; H1N1; diagnostic tests; rapid tests; sensitivity and specificity
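The predictive values follow from the pooled sensitivity and specificity by Bayes' theorem. The plug-in calculation below roughly reproduces the reported values: the NPV matches, and the PPV is close (the paper's figures come from the bivariate random-effects model, so they differ slightly from this plug-in version).

```python
# Plug-in predictive values at an assumed 30% prevalence.
sens, spec, prev = 0.51, 0.98, 0.30

ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
print(round(ppv, 2), round(npv, 2))  # ~0.92 and ~0.82
```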
12.  Efficacy of NNRTI-Based Antiretroviral Therapy Initiated During Acute HIV Infection 
AIDS (London, England)  2011;25(7):941-949.
Objective
To characterize responses to an NNRTI-based antiretroviral treatment (ART) initiated during acute HIV infection (AHI).
Design
This was a prospective, single-arm evaluation of once daily, co-formulated emtricitabine/tenofovir/efavirenz initiated during AHI.
Methods
The primary endpoint was the proportion of responders with HIV RNA <200 copies/mL by week 24. We examined time to viral suppression and CD8 cell activation in relation to baseline participant characteristics. Using linear mixed effects models, we compared time to viral suppression and viral dynamics between acutely infected participants and chronically infected controls.
Results
Between January 2005 and May 2009, 61 AHI participants were enrolled. Of participants whose enrollment date allowed 24 and 48 weeks of follow-up, 47 of 51 (92%) achieved viral suppression to <200 copies/mL by week 24, and 35 of 41 (85.4%) to <50 copies/mL by week 48. The median time from ART initiation to suppression <50 copies/mL was 93 days (range 14–337). Higher HIV RNA levels at ART initiation (p=0.02), but not time from estimated-date-of-infection to ART initiation (p=0.86), were associated with longer time-to-viral-suppression. The median baseline frequency of activated CD8+CD38+HLA-DR+ T-cells was 67% (range 40–95), and was not significantly associated with longer time to viral load suppression (p=0.15). Viremia declined to <50 copies/mL more rapidly in AHI than chronically-infected participants. Mixed model analysis demonstrated similar phase I HIV RNA decay rates between acute and chronically-infected participants, and more rapid viral decline in acutely-infected participants in phase II.
Conclusion
Once daily emtricitabine/tenofovir/efavirenz initiated during AHI achieves rapid and sustained HIV suppression during this highly infectious period.
doi:10.1097/QAD.0b013e3283463c07
PMCID: PMC3569481  PMID: 21487250
Acute HIV infection; NNRTIs; antiretroviral therapy; immune activation; viral dynamics
13.  Maximum likelihood estimation in generalized linear models with multiple covariates subject to detection limits 
Statistics in medicine  2011;10.1002/sim.4280.
The analysis of data subject to detection limits is becoming increasingly necessary in many environmental and laboratory studies. Covariates subject to detection limits are often left censored because the measurement device has a minimal lower limit of detection. In this paper, we propose a Monte Carlo version of the expectation–maximization (EM) algorithm to handle a large number of covariates subject to detection limits in generalized linear models. We model the covariate distribution via a sequence of one-dimensional conditional distributions and sample the covariate values using an adaptive rejection Metropolis algorithm. Parameter estimation is obtained by maximization via the Monte Carlo M-step. This procedure is applied to a real dataset from the National Health and Nutrition Examination Survey, in which values of urinary heavy metals are subject to a limit of detection. Through simulation studies, we show that the proposed approach can lead to a significant reduction in variance for parameter estimates in these models, improving the power of such studies.
doi:10.1002/sim.4280
PMCID: PMC3375355  PMID: 21710558
EM algorithm; Gibbs sampling; logistic regression; maximum likelihood estimation; Monte Carlo EM; NHANES
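A minimal sketch of the Monte Carlo E-step for a single normally distributed covariate: censored values are replaced by draws from the conditional distribution X | X < LOD. The paper's algorithm handles many censored covariates jointly via sequential one-dimensional conditionals and adaptive rejection Metropolis sampling; all parameter values below are hypothetical.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(42)

# Hypothetical log-exposure ~ Normal(mu, sd), reported only above the LOD.
mu, sd, lod = 0.0, 1.0, -0.5
x = rng.normal(mu, sd, 500)
censored = x < lod  # the device reports only "below LOD" for these

# Monte Carlo E-step: impute censored values from the normal distribution
# truncated to (-inf, LOD); the M-step would then refit the GLM on the
# completed data and update the parameter estimates.
imputed = x.copy()
imputed[censored] = truncnorm.rvs(
    a=-np.inf, b=(lod - mu) / sd, loc=mu, scale=sd,
    size=censored.sum(), random_state=rng,
)
```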
14.  Lagging Exposure Information in Cumulative Exposure-Response Analyses 
American Journal of Epidemiology  2011;174(12):1416-1422.
Lagging exposure information is often undertaken to allow for a latency period in cumulative exposure-disease analyses. The authors first consider bias and confidence interval coverage when using the standard approaches of fitting models under several lag assumptions and selecting the lag that maximizes either the effect estimate or model goodness of fit. Next, they consider bias that occurs when the assumption that the latency period is a fixed constant does not hold. Expressions were derived for bias due to misspecification of lag assumptions, and simulations were conducted. Finally, the authors describe a method for joint estimation of parameters describing an exposure-response association and the latency distribution. Analyses of associations between cumulative asbestos exposure and lung cancer mortality among textile workers illustrate this approach. Selecting the lag that maximizes the effect estimate may lead to bias away from the null; selecting the lag that maximizes model goodness of fit may lead to confidence intervals that are too narrow. These problems tend to increase as the within-person exposure variation diminishes. Lagging exposure assignment by a constant will lead to bias toward the null if the distribution of latency periods is not a fixed constant. Direct estimation of latency periods can minimize bias and improve confidence interval coverage.
doi:10.1093/aje/kwr260
PMCID: PMC3276301  PMID: 22047823
asbestos; cohort studies; latency; neoplasms; survival analysis
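A minimal sketch of constant-lag cumulative exposure assignment, the standard approach whose bias the paper characterizes; the exposure history below is hypothetical.

```python
import numpy as np

def lagged_cumulative_exposure(exposures, lag):
    """Cumulative exposure ignoring the most recent `lag` periods.

    exposures[t] is the exposure accrued in period t; the value returned
    for period t sums exposures[0 .. t-lag], so recent exposure is
    assumed etiologically irrelevant (the fixed-latency assumption).
    """
    cum = np.cumsum(exposures, dtype=float)
    out = np.zeros_like(cum)
    out[lag:] = cum[:-lag] if lag > 0 else cum
    return out

annual = np.array([1.0, 2.0, 0.5, 0.0, 1.5])
print(lagged_cumulative_exposure(annual, lag=2))  # [0. 0. 1. 3. 3.5]
```

Fitting the model under several values of `lag` and picking the one that maximizes the effect estimate or the goodness of fit is exactly the practice the paper shows can bias estimates and narrow confidence intervals.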
15.  Bayesian methods in clinical trials: a Bayesian analysis of ECOG trials E1684 and E1690 
Background
E1684 was the pivotal adjuvant melanoma trial that established high-dose interferon (IFN) as effective therapy for high-risk melanoma patients. E1690 was an intriguing effort to corroborate E1684, and the differences between the outcomes of these trials have embroiled the field in controversy over the past several years. The analyses of E1684 and E1690 were carried out separately when the results were published, and no single analysis combining the two trials was subsequently performed.
Method
In this paper, we consider such a joint analysis by carrying out a Bayesian analysis of these two trials, thus providing us with a consistent and coherent methodology for combining the results from these two trials.
Results
The Bayesian analysis using power priors provided a more coherent, flexible, and potentially more accurate analysis than either a separate analysis of the two trials or a frequentist analysis of the combined data. The methodology provides a consistent framework for carrying out a single unified analysis that combines data from two or more studies.
Conclusions
Such Bayesian analyses can be crucial in situations where the results from two theoretically identical trials yield somewhat conflicting or inconsistent results.
doi:10.1186/1471-2288-12-183
PMCID: PMC3571975  PMID: 23194570
Cure rate model; Historical data; Prior distribution; Posterior distribution
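For reference, the power prior has the general form below, with the historical data D₀ (here, one of the two trials) down-weighted by a₀ ∈ [0, 1]; how a₀ was fixed or modeled in this analysis is not stated in the abstract:

```latex
\pi(\theta \mid D, D_{0}, a_{0}) \;\propto\; L(\theta \mid D)\; L(\theta \mid D_{0})^{a_{0}}\; \pi_{0}(\theta),
\qquad 0 \le a_{0} \le 1,
```

so that a₀ = 0 discards the historical trial entirely and a₀ = 1 pools the two trials as if they were one, with intermediate values giving partial borrowing.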
16.  Sample size and power determination in joint modeling of longitudinal and survival data 
Statistics in Medicine  2011;30(18):2295-2309.
Owing to the rapid development of biomarkers in clinical trials, joint modeling of longitudinal and survival data has gained popularity in recent years because it reduces bias and improves the efficiency of inferences about treatment effects and other prognostic factors. Although much effort has been put into inferential methods in joint modeling, such as estimation and hypothesis testing, design aspects have not been formally considered. Statistical design, such as sample size and power calculation, is a crucial first step in clinical trials. In this paper, we derive a closed-form sample size formula for estimating the effect of the longitudinal process in joint modeling, and extend Schoenfeld’s sample size formula to the joint modeling setting for estimating the overall treatment effect. The sample size formula we develop is quite general, allowing for p-degree polynomial trajectories. The robustness of our model is demonstrated in simulation studies with linear and quadratic trajectories. We discuss the impact of the within-subject variability on power, and data collection strategies, such as the spacing and frequency of repeated measurements, that maximize power. When the within-subject variability is large, different data collection strategies can influence the power of the study in a significant way. The optimal frequency of repeated measurements also depends on the nature of the trajectory, with higher-degree polynomial trajectories and larger measurement error requiring more frequent measurements.
doi:10.1002/sim.4263
PMCID: PMC3278672  PMID: 21590793
sample size; power determination; joint modeling; survival analysis; longitudinal data; repeated measurements
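Schoenfeld's formula, which the paper extends to the joint modeling setting, gives the required number of events D for detecting a log hazard ratio θ at two-sided level α with power 1 − γ when a proportion p is randomized to treatment:

```latex
D \;=\; \frac{(z_{1-\alpha/2} + z_{1-\gamma})^{2}}{p\,(1-p)\,\theta^{2}},
```

with the total sample size obtained by dividing D by the anticipated event probability. The exact form of the paper's joint-model extension, which must also account for the longitudinal trajectory and measurement error, is not given in the abstract.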
17.  DNA Methylation Profiling Distinguishes Malignant Melanomas from Benign Nevi 
Pigment cell & melanoma research  2011;24(2):352-360.
Summary
DNA methylation, an epigenetic alteration typically occurring early in cancer development, could aid in the molecular diagnosis of melanoma. We assessed the technical feasibility of high-throughput, array-based DNA-methylation profiling of formalin-fixed paraffin-embedded tissues for selecting candidate DNA-methylation differences between melanomas and nevi. Promoter methylation was evaluated in 27 common benign nevi and 22 primary invasive melanomas using a 1505 CpG-site microarray. Unsupervised hierarchical clustering distinguished melanomas from nevi, and 26 CpG sites in 22 genes were identified with significantly different methylation levels between melanomas and nevi after adjustment for age, sex, and multiple comparisons and with β-value differences of ≥ 0.2. Prediction Analysis for Microarrays identified 12 CpG loci that were highly predictive of melanoma, with areas under the receiver operating characteristic curve greater than 0.95. Of our panel of 22 genes, 14 were statistically significant in an independent sample set of 29 nevi (including dysplastic nevi) and 25 primary invasive melanomas after adjustment for age, sex, and multiple comparisons. This first report of a DNA-methylation signature discriminating melanomas from nevi indicates that DNA methylation appears promising as an additional tool for enhancing melanoma diagnosis.
doi:10.1111/j.1755-148X.2011.00828.x
PMCID: PMC3073305  PMID: 21375697
melanoma; nevi; methylation profiling; diagnostic markers
18.  Linear Regression with an Independent Variable Subject to a Detection Limit 
Epidemiology (Cambridge, Mass.)  2010;21(Suppl 4):S17-S24.
Background
Linear regression with a left-censored independent variable X due to limit of detection (LOD) was recently considered by 2 groups of researchers: Richardson and Ciampi, and Schisterman and colleagues.
Methods
Both groups obtained consistent estimators for the regression slopes by replacing left-censored X with a constant, that is, the expectation of X given X below the LOD, E(X|X < LOD).
Results
Schisterman and colleagues argued that their approach would be a better choice because the sample mean of X given X above the LOD is available, whereas E(X|X < LOD) is not.
Conclusion
Recommendations are given based on theoretical and simulation results. These recommendations are illustrated with 1 case study.
doi:10.1097/EDE.0b013e3181ce97d8
PMCID: PMC3265361  PMID: 21422965
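For reference, the substitution constant at issue has a closed form when X is normal; this is one common way to compute it (the abstract does not specify the distributional assumption):

```latex
X \sim N(\mu, \sigma^{2}), \quad \alpha = \frac{c - \mu}{\sigma}
\;\Longrightarrow\;
\mathbb{E}[X \mid X < c] \;=\; \mu - \sigma\,\frac{\phi(\alpha)}{\Phi(\alpha)},
```

where c is the LOD and φ and Φ are the standard normal density and distribution function. In practice μ and σ must themselves be estimated from the incompletely observed X, which is the crux of the comparison above.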
Bioinformatics  2010;26(22):2849-2855.
Motivation: The Illumina BeadArray is a popular platform for profiling DNA methylation, an important epigenetic event associated with gene silencing and chromosomal instability. However, current approaches rely on an arbitrary detection P-value cutoff to exclude probes and samples from subsequent analysis as a quality control step, which results in missing observations and information loss. It is desirable to have an approach that incorporates all of the data but accounts for the varying quality of individual observations.
Results: We first investigate and propose a statistical framework for removing sources of bias in the Illumina Methylation BeadArray based on several positive control samples. We then introduce a weighted model-based clustering method called LumiWCluster for the Illumina BeadArray that systematically weights each observation according to its detection P-value and avoids discarding subsets of the data. LumiWCluster allows for discovery of distinct methylation patterns and automatic selection of informative CpG loci. We demonstrate the advantages of LumiWCluster on two publicly available Illumina GoldenGate Methylation datasets (ovarian cancer and hepatocellular carcinoma).
Availability: R package LumiWCluster can be downloaded from http://www.unc.edu/~pfkuan/LumiWCluster
Contact: pfkuan@bios.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btq553
PMCID: PMC3025715  PMID: 20880956
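The core idea of weighting observations by data quality can be written generically as a weighted model-based clustering log-likelihood; the exact mapping from detection P-values to weights w_i used by LumiWCluster is not given in the abstract:

```latex
\ell(\boldsymbol{\theta}) \;=\; \sum_{i=1}^{n} w_{i}\, \log \left\{ \sum_{k=1}^{K} \tau_{k}\, f_{k}(\mathbf{x}_{i} \mid \boldsymbol{\theta}_{k}) \right\},
```

where τ_k are the mixture proportions and f_k the component densities. A weight w_i near zero makes an unreliable observation nearly uninformative rather than discarding it outright, which is how the method avoids the missing-data problem created by hard P-value cutoffs.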
BMJ Open  2011;1(2):e000156.
Background
Treatment effects are traditionally assessed through either superiority or non-inferiority clinical trials. Because of safety concerns and/or wide variability across strata in the superiority margin of active controls over placebo, investigators may find that neither a superiority nor a non-inferiority trial design is ethical or practical in some disease populations. Prior knowledge may allow and drive study designers to consider more sophisticated designs for a clinical trial.
Design
In this paper, the authors propose hybrid designs which may combine a superiority design in one subgroup with a non-inferiority design in another subgroup or combine designs with different control regimens in different subgroups in one trial when a uniform design is unethical or impractical. The authors show how the hybrid design can be planned and how inferences can be made. Through two examples, the authors illustrate the scenarios where hybrid designs are useful while the conventional designs are not preferable.
Conclusion
The hybrid design is a useful alternative to current superiority and non-inferiority designs.
Article summary
Article focus
We propose hybrid designs for trials in which neither a superiority nor a non-inferiority design is ethical or practical.
Key messages
The hybrid design is practical, flexible and feasible.
We expect it to become a major alternative to the superiority and non-inferiority designs.
Strengths and limitations of this study
Hybrid designs provide a powerful and relatively simple solution to the difficult problem of active controls with varying efficacy and/or safety concerns. This problem is becoming more common as more drugs become available.
The design and analysis are moderately complex compared with the superiority and non-inferiority designs.
doi:10.1136/bmjopen-2011-000156
PMCID: PMC3191591  PMID: 22021876
Bivariate random effects models are currently one of the main methods recommended for synthesizing diagnostic test accuracy studies. However, only the logit transformation of sensitivity and specificity has previously been considered in the literature. In this paper, we consider a bivariate generalized linear mixed model to jointly model sensitivities and specificities, and we discuss estimation of the summary receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). As special cases of this model, we discuss the commonly used logit, probit and complementary log-log transformations. To evaluate the impact of misspecification of the link functions on estimation, we present two case studies and a set of simulation studies. Our study suggests that point estimation of the median sensitivity and specificity and of the AUC is relatively robust to misspecification of the link functions. However, misspecification of the link functions has a noticeable impact on standard error estimation and on 95% confidence interval coverage, which emphasizes the importance of choosing an appropriate link function for statistical inference.
doi:10.1177/0272989X09353452
PMCID: PMC3035476  PMID: 19959794
meta-analysis; bivariate random effect models; sensitivity; specificity; receiver operating characteristic curve; area under the ROC curve
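The bivariate generalized linear mixed model in question can be written as below, with g the link function under study (logit, probit, or complementary log-log); the notation is ours, not the paper's:

```latex
y_{i1} \sim \mathrm{Bin}(n_{i1}, \mathrm{Se}_{i}), \quad
y_{i0} \sim \mathrm{Bin}(n_{i0}, \mathrm{Sp}_{i}), \quad
\begin{pmatrix} g(\mathrm{Se}_{i}) \\ g(\mathrm{Sp}_{i}) \end{pmatrix}
\sim N\!\left( \begin{pmatrix} \mu_{1} \\ \mu_{2} \end{pmatrix}, \Sigma \right),
```

where, for study i, y_{i1} is the number of true positives among n_{i1} diseased subjects and y_{i0} the number of true negatives among n_{i0} non-diseased subjects. The summary ROC curve and AUC are functions of (μ₁, μ₂, Σ), which is why the choice of g propagates into those estimates.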
Oncology  2010;78(3-4):181-188.
Purpose
NF-κB is an antiapoptotic transcription factor that has been shown to be a mediator of treatment resistance. Bcl-3 is a regulator of NF-κB that may play a role in oncogenesis. The goal of this study was to correlate the activation status of NF-κB and Bcl-3 with clinical outcome in a group of patients with metastatic colorectal cancer (CRC).
Methods
We performed a retrospective study of 23 patients who underwent surgical resection of CRC at the University of North Carolina (UNC). Activation of NF-κB was defined by nuclear expression of select components of NF-κB (p50, p52, p65) and Bcl-3. Tissue microarrays were created in triplicate from cores of normal mucosa, primary tumor, lymph node metastases and liver metastases taken from disparate areas of the blocks, and an intensity score was generated by multiplying staining intensity (0–3+) by the percentage of positive tumor cells. Generalized estimating equations were used to assess differences in intensity scores between normal mucosa and the other tissues. Cox regression models were fit to determine whether scores were significantly associated with overall survival.
Results
p65 nuclear expression was significantly higher in primary tumor and liver metastases than in normal mucosa (both p < 0.01). p50 nuclear expression was significantly higher at all tumor sites than in normal mucosa (primary tumor and lymph node metastases p < 0.0001, liver metastases p < 0.01). Bcl-3 nuclear expression did not differ significantly between normal mucosa and tumor; however, nuclear expression of each of these components in primary tumor was strongly associated with survival: the increase in hazard for each 50-point increase in nuclear expression was 91% for Bcl-3, 66% for p65, and 52% for p50 (all p < 0.05).
Conclusions
Activation of canonical NF-κB subunits p50 and p65 as measured by nuclear expression is strongly associated with survival suggesting NF-κB as a prognostic factor in this disease. Primary tumor nuclear expression appears to be as good as, or better than, metastatic sites at predicting prognosis. Bcl-3 nuclear expression is also negatively associated with survival and deserves further study in CRC.
doi:10.1159/000313697
PMCID: PMC2914399  PMID: 20414006
NF-κB; P65; P50; Colorectal carcinoma
Statistics in medicine  2010;29(11):1206-1218.
Summary
To evaluate the probabilities of a disease state, ideally all subjects in a study should be diagnosed by a definitive diagnostic or gold standard test. However, since definitive diagnostic tests are often invasive and expensive, it is generally unethical to apply them to subjects whose screening tests are negative. In this article, we consider latent class models for screening studies with two imperfect binary diagnostic tests and a definitive categorical disease status measured only for those with at least one positive screening test. Specifically, we discuss a conditional-independence model and three homogeneous conditional-dependence latent class models, and assess the impact of misspecification of the dependence structure on the estimation of disease category probabilities using frequentist and Bayesian approaches. Interestingly, the three homogeneous dependence models can provide identical goodness of fit but substantively different estimates for a given study. However, the parametric form of the assumed dependence structure itself is not “testable” from the data, and thus the dependence structure modeling considered here can only be viewed as a sensitivity analysis for a more complicated non-identifiable model potentially involving a heterogeneous dependence structure. Furthermore, we discuss Bayesian model averaging, together with its limitations, as an alternative way to partially address this particularly challenging problem. The methods are applied to two cancer screening studies, and simulations are conducted to evaluate the performance of these methods. In summary, further research is needed to reduce the impact of model misspecification on the estimation of disease prevalence in such settings.
doi:10.1002/sim.3862
PMCID: PMC2879599  PMID: 20191614
maximum likelihood; Bayesian inference; diagnostic test; dependence; screening; latent class models
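For reference, the conditional-independence latent class model assumes the two screening tests are independent given the true disease category D:

```latex
\Pr(T_{1} = s,\, T_{2} = t) \;=\; \sum_{c} \pi_{c}\; \Pr(T_{1} = s \mid D = c)\; \Pr(T_{2} = t \mid D = c),
```

where π_c is the prevalence of disease category c. The conditional-dependence models add an association term to the within-class joint distribution of (T₁, T₂); as the abstract notes, the data cannot distinguish among parametric forms for that term, which is what makes the analysis a sensitivity analysis.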
That conditioning on a common effect of exposure and outcome may cause selection, or collider-stratification, bias is not intuitive. We provide two hypothetical examples to convey concepts underlying bias due to conditioning on a collider. In the first example, fever is a common effect of influenza and consumption of a tainted egg-salad sandwich. In the second example, case-status is a common effect of a genotype and an environmental factor. In both examples, conditioning on the common effect imparts an association between two otherwise independent variables; we call this selection bias.
doi:10.1093/ije/dyp334
PMCID: PMC2846442  PMID: 19926667
Bias; selection; methods; epidemiologic
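A small simulation in the spirit of the second example: genotype and environmental exposure are generated independently, and conditioning on case status (their common effect) induces an association between them. All prevalences and effect sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

genotype = rng.random(n) < 0.3   # independent binary causes
exposure = rng.random(n) < 0.2
# Case status is a common effect (collider) of both causes.
p_case = 0.01 + 0.10 * genotype + 0.10 * exposure
case = rng.random(n) < p_case

def odds_ratio(a, b):
    t = np.array([[np.sum(a & b), np.sum(a & ~b)],
                  [np.sum(~a & b), np.sum(~a & ~b)]], dtype=float)
    return (t[0, 0] * t[1, 1]) / (t[0, 1] * t[1, 0])

print(odds_ratio(genotype, exposure))              # ~1: marginally independent
print(odds_ratio(genotype[case], exposure[case]))  # <1: induced by conditioning
```

Among cases, subjects without the genotype are more likely to have become cases through the exposure, so the two independent causes appear negatively associated once the collider is conditioned on.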
Statistics in medicine  2009;28(26):3276-3293.
Summary
In the survival analysis context, when an intervention either reduces a harmful exposure or introduces a beneficial treatment, it seems useful to quantify the gain in survival attributable to the intervention as an alternative to the reduction in risk. To accomplish this we introduce two new concepts, the attributable survival and attributable survival time, and study their properties. Our analysis includes comparison with the attributable risk function as well as hazard-based alternatives. We also extend the setting to the case where the intervention takes place at discrete points in time, and may either eliminate exposure or introduce a beneficial treatment in only a proportion of the available group. This generalization accommodates the more realistic situation where the treatment or exposure is dynamic. We apply these methods to assess the effect of introducing highly active antiretroviral therapy for the treatment of clinical AIDS at the population level.
doi:10.1002/sim.3705
PMCID: PMC3057448  PMID: 19697303
attributable risk function; survival analysis; parametric models; generalized gamma distribution; product limit estimate
