1.  Evaluation of Metabolite Biomarkers for Hepatocellular Carcinoma through Stratified Analysis by Gender, Race and Alcoholic Cirrhosis 
The effects of hepatocellular carcinoma (HCC) on liver metabolism and circulating metabolites have been subjected to continuing investigation. This study compares the levels of selected metabolites in sera of HCC cases versus patients with liver cirrhosis and evaluates the influence of gender, race, and alcoholic cirrhosis on the performance of the metabolites as candidate biomarkers for HCC.
Targeted quantitation of 15 metabolites is performed by selected reaction monitoring (SRM) in sera from 89 Egyptian subjects (40 HCC cases and 49 cirrhotic controls) and 110 US subjects (56 HCC cases and 54 cirrhotic controls). Logistic regression models are used to evaluate the ability of these metabolites to distinguish HCC cases from cirrhotic controls. The influences of gender, race, and alcoholic cirrhosis on the performance of the metabolites are analyzed by stratified logistic regression.
Two metabolites are selected based on their significance to both cohorts. While both metabolites discriminate HCC cases from cirrhotic controls in males and Caucasians, they are insignificant in females and African Americans. One metabolite is significant in patients with alcoholic cirrhosis and the other in non-alcoholic cirrhosis.
The study demonstrates the potential of two metabolites as candidate biomarkers for HCC by combining them with α-fetoprotein and gender. Stratified statistical analyses reveal that gender, race, and alcoholic cirrhosis affect the relative levels of small molecules in serum.
The findings of this study contribute to a better understanding of the influence of gender, race, and alcoholic cirrhosis in investigating small molecules as biomarkers for HCC.
PMCID: PMC3947117  PMID: 24186894
Mass spectrometry; metabolomics; cancer biomarker; liver cirrhosis; health disparity
2.  Pathway and Network Approaches for Identification of Cancer Signature Markers from Omics Data 
Journal of Cancer  2015;6(1):54-65.
The advancement of high-throughput omic technologies during the past few years has made it possible to perform many complex assays in a much shorter time than the traditional approaches. The rapid accumulation and wide availability of omic data generated by these technologies offer great opportunities to unravel disease mechanisms, but also present significant challenges in extracting knowledge from such massive data and in evaluating the findings. To address these challenges, a number of pathway and network based approaches have been introduced. This review article evaluates these methods and discusses their application in cancer biomarker discovery using hepatocellular carcinoma (HCC) as an example.
PMCID: PMC4278915  PMID: 25553089
Biological pathways; system biology; high-throughput omics data; cancer biomarker.
3.  Construct Validity and Factor Structure of the Pittsburgh Sleep Quality Index and Epworth Sleepiness Scale in a Multi-National Study of African, South East Asian and South American College Students 
PLoS ONE  2014;9(12):e116383.
The Pittsburgh Sleep Quality Index (PSQI) and the Epworth Sleepiness Scale (ESS) are questionnaires used to assess sleep quality and excessive daytime sleepiness in clinical and population-based studies. The present study aimed to evaluate the construct validity and factor structure of the PSQI and ESS questionnaires among young adults in four countries (Chile, Ethiopia, Peru and Thailand).
A cross-sectional study was conducted among 8,481 undergraduate students. Students were invited to complete a self-administered questionnaire that collected information about lifestyle, demographic, and sleep characteristics. In each country, the construct validity and factorial structures of PSQI and ESS questionnaires were tested through exploratory and confirmatory factor analyses (EFA and CFA).
The largest component-total correlation coefficient for sleep quality as assessed using PSQI was noted in Chile (r = 0.71) while the smallest component-total correlation coefficient was noted for sleep medication use in Peru (r = 0.28). The largest component-total correlation coefficient for excessive daytime sleepiness as assessed using ESS was found for item 1 (sitting/reading) in Chile (r = 0.65) while the lowest item-total correlation was observed for item 6 (sitting and talking to someone) in Thailand (r = 0.35). Using both EFA and CFA, a two-factor model was found for the PSQI questionnaire in Chile, Ethiopia and Thailand while a three-factor model was found for Peru. For the ESS questionnaire, we noted two factors for all four countries.
Overall, we documented cross-cultural comparability of sleep quality and excessive daytime sleepiness measures using the PSQI and ESS questionnaires among Asian, South American and African young adults. Although both the PSQI and ESS were originally developed as single-factor questionnaires, the results of our EFA and CFA revealed the multidimensionality of the scales, suggesting limited usefulness of the global PSQI and ESS scores to assess sleep quality and excessive daytime sleepiness.
PMCID: PMC4281247  PMID: 25551586
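The component-total correlations reported in the abstract above relate the score on one questionnaire component to the overall score. The following is a minimal sketch of that calculation, assuming a Pearson correlation against an uncorrected total (the abstract does not state whether the component was removed from the total before correlating). The respondent scores and the function names are invented for illustration and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def component_total_correlations(rows):
    """Correlate each component with the total score.

    rows: one list of component scores per respondent (hypothetical data).
    Assumption: the total includes the component itself (uncorrected).
    """
    totals = [sum(r) for r in rows]
    n_components = len(rows[0])
    return [pearson([r[j] for r in rows], totals) for j in range(n_components)]


# Invented scores for five respondents on three components (0-3 each).
scores = [[0, 1, 0], [1, 1, 0], [2, 2, 1], [3, 2, 0], [3, 3, 2]]
correlations = component_total_correlations(scores)
```

A component with a low correlation to the total, such as the sleep medication component reported for Peru, contributes little shared variance to the global score.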
4.  Placental Genome and Maternal-Placental Genetic Interactions: A Genome-Wide and Candidate Gene Association Study of Placental Abruption 
PLoS ONE  2014;9(12):e116346.
While available evidence supports the role of genetics in the pathogenesis of placental abruption (PA), PA-related placental genome variations and maternal-placental genetic interactions have not been investigated. Maternal blood and placental samples collected from participants in the Peruvian Abruptio Placentae Epidemiology study were genotyped using Illumina’s Cardio-Metabochip platform. We examined 118,782 genome-wide SNPs and 333 SNPs in 32 candidate genes from mitochondrial biogenesis and oxidative phosphorylation pathways in placental DNA from 280 PA cases and 244 controls. We assessed maternal-placental interactions in the candidate gene SNPs and two imprinted regions (IGF2/H19 and C19MC). Univariate and penalized logistic regression models were fit to estimate odds ratios. We examined the combined effect of multiple SNPs on PA risk using weighted genetic risk scores (WGRS) with repeated ten-fold cross-validations. A multinomial model was used to investigate maternal-placental genetic interactions. In placental genome-wide and candidate gene analyses, no SNP was significant after false discovery rate correction. The top genome-wide association study (GWAS) hits were rs544201, rs1484464 (CTNNA2), rs4149570 (TNFRSF1A) and rs13055470 (ZNRF3) (p-values: 1.11e-05 to 3.54e-05). The top 200 SNPs of the GWAS overrepresented genes involved in cell cycle, growth and proliferation. The top candidate gene hits were rs16949118 (COX10) and rs7609948 (THRB) (p-values: 6.00e-03 and 8.19e-03). Participants in the highest quartile of WGRS based on cross-validations using SNPs selected from the GWAS and candidate gene analyses had an 8.40-fold (95% CI: 5.8–12.56) and a 4.46-fold (95% CI: 2.94–6.72) higher odds of PA compared to participants in the lowest quartile. We found maternal-placental genetic interactions on PA risk for two SNPs in PPARG (chr3:12313450 and chr3:12412978) and maternal imprinting effects for multiple SNPs in the C19MC and IGF2/H19 regions.
Variations in the placental genome and interactions between maternal-placental genetic variations may contribute to PA risk. Larger studies may help advance our understanding of PA pathogenesis.
PMCID: PMC4280220  PMID: 25549360
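A weighted genetic risk score, as used in the abstract above, combines several SNPs into one number per participant. The sketch below assumes the common convention of weighting each risk-allele count (0, 1 or 2) by the natural log of that SNP's odds ratio. The abstract does not state the weighting scheme, so this is an assumption, and the SNP labels, odds ratios and genotypes are hypothetical.

```python
from math import log


def weighted_genetic_risk_score(allele_counts, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2), each weighted by log(odds ratio)."""
    return sum(allele_counts[snp] * log(odds_ratios[snp]) for snp in odds_ratios)


# Hypothetical per-SNP odds ratios; not taken from the study.
odds_ratios = {"snp_a": 1.5, "snp_b": 2.0, "snp_c": 1.2}

# Hypothetical genotypes coded as the number of risk alleles carried.
participants = {
    "p1": {"snp_a": 0, "snp_b": 0, "snp_c": 0},
    "p2": {"snp_a": 1, "snp_b": 0, "snp_c": 2},
    "p3": {"snp_a": 2, "snp_b": 1, "snp_c": 1},
    "p4": {"snp_a": 2, "snp_b": 2, "snp_c": 2},
}

scores = {
    pid: weighted_genetic_risk_score(genotype, odds_ratios)
    for pid, genotype in participants.items()
}

# Participants ordered from lowest to highest score; in a real analysis
# this ordering would be cut into quartiles before comparing odds.
ranked = sorted(scores, key=scores.get)
```

In a cross-validated setting the weights would be estimated on the training folds only and applied to the held-out fold, so that the score is not evaluated on the data used to fit it.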
5.  Multi-profile Bayesian alignment model for LC-MS data analysis with integration of internal standards 
Bioinformatics  2013;29(21):2774-2780.
Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various ‘-omic’ studies including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data.
Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information including internal standards and clustered chromatograms in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated based on ground-truth data, by measuring correlation of variation, RT difference across runs and peak-matching performance. We demonstrate that the Bayesian alignment model significantly improves RT alignment performance through appropriate integration of relevant information.
Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3799465  PMID: 24013927
6.  Daytime Sleepiness, Circadian Preference, Caffeine Consumption and Use of Other Stimulants among Thai College Students 
We conducted this study to evaluate the prevalence of daytime sleepiness and evening chronotype, and to assess the extent to which both are associated with the use of caffeinated stimulants among 3,000 Thai college students. Demographic and behavioral characteristics were collected using a self-administered questionnaire. The Epworth Sleepiness Scale and the Horne and Ostberg Morningness-Eveningness Questionnaire were used to evaluate prevalence of daytime sleepiness and circadian preference. Multivariable logistic regression models were used to evaluate the association between sleep disorders and consumption of caffeinated beverages. Overall, the prevalence of daytime sleepiness was 27.9% (95% CI: 26.2–29.5%) while the prevalence of evening chronotype was 13% (95% CI: 11.8–14.2%). Students who used energy drinks were more likely to be evening types. For instance, the use of M100/M150 energy drinks was associated with a more than 3-fold increased odds of evening chronotype (OR 3.50; 95% CI 1.90–6.44), while Red Bull users were more than twice as likely to have evening chronotype (OR 2.39; 95% CI 1.02–5.58). Additionally, those who consumed any energy drinks were more likely to be daytime sleepers. For example, Red Bull (OR 1.72; 95% CI 1.08–2.75) or M100/M150 (OR 1.52; 95% CI 1.10–2.11) consumption was associated with increased odds of daytime sleepiness. Our findings emphasize the importance of implementing educational and prevention programs targeted toward improving sleep hygiene and reducing the consumption of energy drinks among young adults.
PMCID: PMC4209847  PMID: 25356368
7.  Sleep Quality and Sleep Patterns in Relation to Consumption of Energy Drinks, Caffeinated Beverages and Other Stimulants among Thai College Students 
Sleep & breathing = Schlaf & Atmung  2012;17(3):1017-1028.
Poor sleep and heavy use of caffeinated beverages have been implicated as risk factors for a number of adverse health outcomes. Caffeine consumption and use of other stimulants are common among college students globally. However, to our knowledge, no studies have examined the influence of caffeinated beverages on sleep quality of college students in Southeast Asian populations. We conducted this study to evaluate the patterns of sleep quality; and to examine the extent to which poor sleep quality is associated with consumption of energy drinks, caffeinated beverages and other stimulants among 2,854 Thai college students.
A questionnaire was administered to ascertain demographic and behavioral characteristics. The Pittsburgh Sleep Quality Index (PSQI) was used to assess sleep habits and quality. Chi-square tests and multivariate logistic regression models were used to identify statistically significant associations.
Overall, the prevalence of poor sleep quality was found to be 48.1%. More than half of the students (58.0%) used stimulant beverages. Stimulant use (OR 1.50; 95% CI 1.28-1.77) was found to be statistically significant and positively associated with poor sleep quality. Alcohol consumption (OR 3.10; 95% CI 1.72-5.59) and cigarette smoking (OR 1.43; 95% CI 1.02-1.98) also had statistically significant associations with increased daytime dysfunction. In conclusion, stimulant use is common among Thai college students and is associated with several indices of poor sleep quality.
Our findings underscore the need to educate students on the importance of sleep and the influences of dietary and lifestyle choices on their sleep quality and overall health.
PMCID: PMC3621002  PMID: 23239460
Sleep; Energy Drinks; Alcohol; Caffeine; Students; Cigarettes
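The odds ratios and 95% confidence intervals quoted in the abstract above come from multivariate logistic regression. As a simpler illustration of how an odds ratio and its interval are formed, the sketch below computes an unadjusted odds ratio with a Wald (Woolf) interval from a 2x2 table. This is a substitute technique, not the adjusted model used in the study, and the counts are invented.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald (Woolf) confidence interval.

    a: exposed with outcome,    b: exposed without outcome,
    c: unexposed with outcome,  d: unexposed without outcome.
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Invented counts: 60 of 100 stimulant users and 40 of 100 non-users
# report poor sleep quality.
estimate, lower, upper = odds_ratio_ci(60, 40, 40, 60)
```

An interval whose lower bound stays above 1 corresponds to the "statistically significant and positively associated" wording in the abstract. An adjusted analysis would instead exponentiate a logistic regression coefficient and its interval.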
8.  Daytime Sleepiness, Circadian Preference, Caffeine Consumption and Khat Use among College Students in Ethiopia 
Journal of sleep disorders-- treatment & care  2013;3(1):10.4172/2325-9639.1000130.
To estimate the prevalence of daytime sleepiness and circadian preferences, and to examine the extent to which caffeine consumption and Khat (a herbal stimulant) use are associated with daytime sleepiness and evening chronotype among Ethiopian college students.
A cross-sectional study was conducted among 2,410 college students. A self-administered questionnaire was used to collect information about sleep and behavioral risk factors such as consumption of caffeinated beverages, tobacco, alcohol, and Khat. Daytime sleepiness and chronotype were assessed using the Epworth Sleepiness Scale (ESS) and the Horne & Ostberg Morningness/Eveningness Questionnaire (MEQ), respectively. Linear and logistic regression models were used to evaluate associations.
Daytime sleepiness (ESS≥10) was present in 26% of the students (95% CI: 24.4–27.8%) with 25.9% in males and 25.5% in females. A total of 30 (0.8%) students were classified as evening chronotypes (0.7% in females and 0.9% in males). Overall, Khat consumption, excessive alcohol use and cigarette smoking status were associated with evening chronotype. Use of any caffeinated beverages (OR=2.18; 95% CI: 0.82–5.77) and Khat consumption (OR=7.43; 95% CI: 3.28–16.98) increased the odds of evening chronotype.
The prevalence of daytime sleepiness among our study population was high while few were classified as evening chronotypes. We also found increased odds of evening chronotype with caffeine consumption and Khat use amongst Ethiopian college students. Prospective cohort studies that examine the effects of caffeinated beverages and Khat use on sleep disorders among young adults are needed.
PMCID: PMC4015623  PMID: 24818170
9.  Profile-Based LC-MS Data Alignment—A Bayesian Approach 
A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets.
PMCID: PMC3993096  PMID: 23929872
Alignment; Bayesian inference; block Metropolis-Hastings algorithm; liquid chromatography-mass spectrometry (LC-MS); Markov chain Monte Carlo (MCMC); stochastic search variable selection (SSVS)
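Dynamic time warping (DTW) is one of the baseline methods the abstract above compares against. The sketch below shows the classic DTW recurrence on two one-dimensional intensity profiles. It illustrates only that baseline, not the Bayesian alignment model itself, and the two toy chromatograms are invented.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping cost between two 1-D sequences.

    cost[i][j] holds the minimum cumulative absolute difference needed
    to align the first i points of x with the first j points of y.
    """
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Invented toy profiles: the same peak, shifted by one scan.
reference = [0, 0, 1, 5, 1, 0, 0]
shifted = [0, 0, 0, 1, 5, 1, 0]

warped_cost = dtw_distance(reference, shifted)
pointwise_cost = sum(abs(a - b) for a, b in zip(reference, shifted))
```

Here the warped cost is zero while the point-by-point cost is not, which is the retention time shift that alignment is meant to absorb. Unconstrained DTW can also over-warp real peaks, which is one motivation for model-based approaches that constrain the mapping function.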
10.  LC-MS Based Serum Metabolomics for Identification of Hepatocellular Carcinoma Biomarkers in Egyptian Cohort 
Journal of proteome research  2012;11(12):5914-5923.
Although hepatocellular carcinoma (HCC) has been subjected to continuous investigation and its symptoms are well known, early-stage diagnosis of this disease remains difficult and the survival rate after diagnosis is typically very low (3–5%). Early and accurate detection of metabolic changes in the sera of patients with liver cirrhosis can help improve the prognosis of HCC and lead to a better understanding of its mechanism at the molecular level, thus providing patients with in-time treatment of the disease. In this study, we compared metabolite levels in sera of 40 HCC patients and 49 cirrhosis patients from Egypt by using ultra-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometer (UPLC-QTOF MS). Following data preprocessing, the most relevant ions in distinguishing HCC cases from cirrhotic controls are selected by statistical methods. Putative metabolite identifications for these ions are obtained through mass-based database search. The identities of some of the putative identifications are verified by comparing their MS/MS fragmentation patterns and retention times with those from authentic compounds. Finally, the serum samples are reanalyzed for quantitation of selected metabolites along with other metabolites previously selected as candidate biomarkers of HCC. This quantitation was performed using isotope dilution by selected reaction monitoring (SRM) on a triple quadrupole linear ion trap (QqQLIT) coupled to UPLC. Statistical analysis of the UPLC-QTOF data identified 274 monoisotopic ion masses with statistically significant differences in ion intensities between HCC cases and cirrhotic controls. Putative identifications were obtained for 158 ions by mass based search against databases. We verified the identities of selected putative identifications including glycocholic acid (GCA), glycodeoxycholic acid (GDCA), 3beta, 6beta-dihydroxy-5beta-cholan-24-oic acid, oleoyl carnitine, and Phe-Phe.
SRM-based quantitation confirmed significant differences between HCC and cirrhotic controls in metabolite levels of bile acid metabolites, long chain carnitines and small peptide. Our study provides useful insight into appropriate experimental design and computational methods for serum biomarker discovery using LC-MS/MS based metabolomics. This study has led to the identification of candidate biomarkers with significant changes in metabolite levels between HCC cases and cirrhotic controls. This is the first MS-based metabolic biomarker discovery study on Egyptian subjects that led to the identification of candidate metabolites that discriminate early stage HCC from patients with liver cirrhosis.
PMCID: PMC3719870  PMID: 23078175
Hepatocellular carcinoma; liver cirrhosis; metabolic biomarker; cancer biomarker discovery; selected reaction monitoring; isotope dilution; mass spectrometry
11.  Gaussian process regression model for normalization of LC-MS data using scan-level information 
Proteome Science  2013;11(Suppl 1):S13.
Differences in sample collection, biomolecule extraction, and instrument variability introduce bias to data generated by liquid chromatography coupled with mass spectrometry (LC-MS). Normalization is used to address these issues. In this paper, we introduce a new normalization method using the Gaussian process regression model (GPRM) that utilizes information from individual scans within an extracted ion chromatogram (EIC) of a peak. The proposed method is particularly applicable for normalization based on analysis order of LC-MS runs. Our method uses measurement variabilities estimated through LC-MS data acquired from quality control samples to correct for bias caused by instrument drift. A maximum likelihood approach is used to find the optimal parameters for the fitted GPRM. We review several normalization methods and compare their performance with GPRM.
To evaluate the performance of different normalization methods, we consider LC-MS data from a study where a metabolomic approach is utilized to discover biomarkers for liver cancer. The LC-MS data were acquired by analysis of sera from liver cancer patients and cirrhotic controls. In addition, LC-MS runs from a quality control (QC) sample are included to assess the run-to-run variability and to evaluate the ability of various normalization methods to reduce this undesired variability. Also, ANOVA models are applied to the normalized LC-MS data to identify ions with intensity measurements that are significantly different between cases and controls.
One of the challenges in using label-free LC-MS for quantitation of biomolecules is systematic bias in measurements. Several normalization methods have been introduced to overcome this issue, but there is no universally applicable approach at the present time. Each data set should be carefully examined to determine the most appropriate normalization method. We review here several existing methods and introduce the GPRM for normalization of LC-MS data. Through our in-house data set, we show that the GPRM outperforms other normalization methods considered here, in terms of decreasing the variability of ion intensities among quality control runs.
PMCID: PMC3908948  PMID: 24564985
Extracted ion chromatogram (EIC); Evaluation; Gaussian process; Liquid chromatography-mass spectrometry (LC-MS); Normalization; Quality control (QC); Scan-level data
12.  Utilization of Metabolomics to Identify Serum Biomarkers for Hepatocellular Carcinoma in Patients with Liver Cirrhosis 
Analytica chimica acta  2012;743C:90-100.
Characterizing the metabolic changes pertaining to hepatocellular carcinoma (HCC) in patients with liver cirrhosis is believed to contribute towards early detection, treatment, and understanding of the molecular mechanisms of HCC. In this study, we compare metabolite levels in sera of 78 HCC cases with 184 cirrhotic controls by using ultra performance liquid chromatography coupled with a hybrid quadrupole time-of-flight mass spectrometry (UPLC-QTOF MS). Following data preprocessing, the most relevant ions in distinguishing HCC cases from patients with cirrhosis are selected by parametric and non-parametric statistical methods. Putative metabolite identifications for these ions are obtained through mass-based database search. Verification of the identities of selected metabolites is conducted by comparing their MS/MS fragmentation patterns and retention time with those from authentic compounds. Quantitation of these metabolites is performed in a subset of the serum samples (10 HCC and 10 cirrhosis) using isotope dilution by selected reaction monitoring (SRM) on triple quadrupole linear ion trap (QqQLIT) and triple quadrupole (QqQ) mass spectrometers. The results of this analysis confirm that metabolites involved in sphingolipid metabolism and phospholipid catabolism such as sphingosine-1-phosphate (S-1-P) and lysophosphatidylcholine (lysoPC 17:0) are up-regulated in sera of HCC vs. those with liver cirrhosis. Down-regulated metabolites include those involved in bile acid biosynthesis (specifically cholesterol metabolism) such as glycochenodeoxycholic acid 3-sulfate (3-sulfo-GCDCA), glycocholic acid (GCA), glycodeoxycholic acid (GDCA), taurocholic acid (TCA), and taurochenodeoxycholate (TCDCA). These results provide useful insights into HCC biomarker discovery utilizing metabolomics as an efficient and cost-effective platform. 
Our work shows that metabolomic profiling is a promising tool to identify candidate metabolic biomarkers for early detection of HCC cases in the high-risk population of cirrhotic patients.
PMCID: PMC3419576  PMID: 22882828
Metabolomics; biomarkers; liquid chromatography-mass spectrometry; hepatocellular carcinoma; selected reaction monitoring
The annals of applied statistics  2011;5(3):10.1214/11-AOAS463.
The vast amount of biological knowledge accumulated over the years has allowed researchers to identify various biochemical interactions and define different families of pathways. There is an increased interest in identifying pathways and pathway elements involved in particular biological processes. Drug discovery efforts, for example, are focused on identifying biomarkers as well as pathways related to a disease. We propose a Bayesian model that addresses this question by incorporating information on pathways and gene networks in the analysis of DNA microarray data. Such information is used to define pathway summaries, specify prior distributions, and structure the MCMC moves to fit the model. We illustrate the method with an application to gene expression data with censored survival outcomes. In addition to identifying markers that would have been missed otherwise and improving prediction accuracy, the integration of existing biological knowledge into the analysis provides a better understanding of underlying molecular processes.
PMCID: PMC3650864  PMID: 23667412
Bayesian variable selection; gene expression; Markov chain Monte Carlo; Markov random field prior; pathway selection
14.  Inflammation induces fibrinogen nitration in experimental human endotoxemia 
Free radical biology & medicine  2009;47(8):1140-1146.
Elevated plasma fibrinogen is a prothrombotic risk factor for cardiovascular disease (CVD). Recent small studies report that fibrinogen oxidative modifications, specifically tyrosine residue nitration, can occur in inflammatory states and may modify fibrinogen function. HDL cholesterol is inversely related to CVD and suggested to reduce the oxidation of LDL cholesterol, but whether these antioxidant functions extend to fibrinogen modifications is unknown. We used a recently validated ELISA to quantify nitrated fibrinogen during experimental human endotoxemia (N=23) and in a cohort of healthy adults (N=361) who were characterized for inflammatory and HDL parameters as well as subclinical atherosclerosis measures, carotid artery intima-medial thickness (IMT) and coronary artery calcification (CAC). Fibrinogen nitration increased following endotoxemia and directly correlated with accelerated ex vivo plasma clotting velocity. In the observational cohort, nitrated fibrinogen was associated with levels of CRP and serum amyloid A. Nitrated fibrinogen levels were not lower with increasing HDL cholesterol and did not associate with IMT and CAC. In humans, fibrinogen nitration was induced during inflammation and was correlated with markers of inflammation and clotting function but not HDL cholesterol or subclinical atherosclerosis in our modest sample. Inflammation-induced fibrinogen nitration may be a risk factor for promoting CVD events.
PMCID: PMC3651370  PMID: 19631267
Fibrinogen; Nitration; Intima-medial thickness; Coronary artery calcification; High-density lipoprotein (HDL)
15.  The Epidemiology of Sleep Quality, Sleep Patterns, Consumption of Caffeinated Beverages, and Khat Use among Ethiopian College Students 
Sleep Disorders  2012;2012:583510.
Objective. To evaluate sleep habits, sleep patterns, and sleep quality among Ethiopian college students; and to examine associations of poor sleep quality with consumption of caffeinated beverages and other stimulants. Methods. A total of 2,230 undergraduate students completed a self-administered comprehensive questionnaire which gathered information about sleep complaints, sociodemographic and lifestyle characteristics, and the use of caffeinated beverages and khat. We used multivariable logistic regression procedures to estimate odds ratios for the associations of poor sleep quality with sociodemographic and behavioral factors. Results. Overall 52.7% of students were classified as having poor sleep quality (51.8% among males and 56.9% among females). In adjusted multivariate analyses, caffeine consumption (OR = 1.55; 95% CI: 1.25–1.92), cigarette smoking (OR = 1.68; 95% CI: 1.06–2.63), and khat use (OR = 1.72, 95% CI: 1.09–2.71) were all associated with increased odds of long-sleep latency (>30 minutes). Cigarette smoking (OR = 1.74; 95% CI: 1.11–2.73) and khat consumption (OR = 1.91; 95% CI: 1.22–3.00) were also significantly associated with poor sleep efficiency (<85%), as well as with increased use of sleep medicine. Conclusion. Findings from the present study demonstrate the high prevalence of poor sleep quality and its association with stimulant use among college students. Preventive and educational programs for students should include modules that emphasize the importance of sleep and associated risk factors.
PMCID: PMC3581089  PMID: 23710363
16.  Probabilistic Mixture Regression Models for Alignment of LC-MS Data 
A novel framework of a probabilistic mixture regression model (PMRM) is presented for alignment of liquid chromatography-mass spectrometry (LC-MS) data with respect to both retention time (RT) and mass-to-charge ratio (m/z). The expectation maximization algorithm is used to estimate the joint parameters of spline-based mixture regression models and prior transformation density models. The latter accounts for the variability in RT points, m/z values, and peak intensities. The applicability of PMRM for alignment of LC-MS data is demonstrated through three datasets. The performance of PMRM is compared with other alignment approaches including dynamic time warping, correlation optimized warping, and continuous profile model in terms of coefficient of variation of replicate LC-MS runs and accuracy in detecting differentially abundant peptides/proteins.
PMCID: PMC3006656  PMID: 20837998
liquid chromatography; mass spectrometry; mixed-regression model; expectation-maximization
17.  Analysis of Normal-Tumour Tissue Interaction in Tumours: Prediction of Prostate Cancer Features from the Molecular Profile of Adjacent Normal Cells 
PLoS ONE  2011;6(3):e16492.
Statistical modelling, in combination with genome-wide expression profiling techniques, has demonstrated that the molecular state of the tumour is sufficient to infer its pathological state. These studies have been extremely important in diagnostics and have contributed to improving our understanding of tumour biology. However, their importance in in-depth understanding of cancer patho-physiology may be limited since they do not explicitly take into consideration the fundamental role of the tissue microenvironment in specifying tumour physiology. Because of the importance of normal cells in shaping the tissue microenvironment, we formulate the hypothesis that molecular components of the profile of normal epithelial cells adjacent to the tumour are predictive of tumour physiology. We addressed this hypothesis by developing statistical models that link gene expression profiles representing the molecular state of adjacent normal epithelial cells to tumour features in prostate cancer. Furthermore, network analysis showed that predictive genes are linked to the activity of important secreted factors, which have the potential to influence tumour biology, such as IL1, IGF1, PDGF BB, AGT, and TGFβ.
PMCID: PMC3068146  PMID: 21479216
18.  A Bayesian Based Functional Mixed-Effects Model for Analysis of LC-MS Data 
A Bayesian multilevel functional mixed-effects model with group-specific random effects is presented for analysis of liquid chromatography-mass spectrometry (LC-MS) data. The proposed framework allows alignment of LC-MS spectra with respect to both retention time (RT) and mass-to-charge ratio (m/z). Affine transformations are incorporated within the model to account for any variability along the RT and m/z dimensions. Simultaneous posterior inference of all unknown parameters is accomplished via a Markov chain Monte Carlo method using the Gibbs sampling algorithm. The proposed approach is computationally tractable and allows incorporating prior knowledge in the inference process. We demonstrate the applicability of our approach for alignment of LC-MS spectra based on total ion count profiles derived from two LC-MS datasets.
PMCID: PMC2896560  PMID: 19963938
19.  Adipokines, Insulin Resistance and Coronary Artery Calcification 
We evaluated the hypothesis that plasma levels of adiponectin and leptin are independently but oppositely associated with coronary calcification (CAC), a measure of subclinical atherosclerosis. In addition, we assessed which biomarkers of adiposity and insulin resistance are the strongest predictors of CAC beyond traditional risk factors, the metabolic syndrome and plasma C-reactive protein (CRP).
Adipokines are fat-secreted biomolecules with pleiotropic actions that converge in diabetes and cardiovascular disease.
We examined the association of plasma adipocytokines with CAC in 860 asymptomatic, non-diabetic participants in the Study of Inherited Risk of Coronary Atherosclerosis (SIRCA).
Plasma adiponectin and leptin levels had opposite and distinct associations with adiposity, insulin resistance and inflammation. Plasma leptin was positively (top vs. bottom quartile) associated with higher CAC after adjusting for age, gender, traditional risk factors and Framingham Risk Scores (FRS) [tobit regression ratio 2.42 (95% CI 1.48–3.95, p=0.002)] and further adjusting for metabolic syndrome and CRP [ratio 2.31 (95% CI 1.36–3.94, p=0.002)]. In contrast, adiponectin levels were not associated with CAC. Comparative analyses suggested that levels of leptin, IL-6 and sol-TNFR2 as well as HOMA-IR predicted CAC scores but only leptin and HOMA-IR provided value beyond risk factors, the metabolic syndrome and CRP.
In SIRCA, while both leptin and adiponectin levels were associated with metabolic and inflammatory markers, only leptin was a significant independent predictor of CAC. Of several metabolic markers, leptin and the HOMA-IR index had the most robust, independent associations with CAC.
Condensed Abstract
Adipokines are fat-secreted biomolecules with pleiotropic actions and represent novel markers for cardiovascular risk. We examined the association of plasma adipocytokines with CAC in 860 asymptomatic, non-diabetic Caucasians. Leptin was positively (top vs. bottom quartile) associated with higher CAC even after adjustment for age, gender, traditional risk factors, Framingham Risk Score, metabolic syndrome, and CRP [ratio 2.31 (95% CI 1.36–3.94, p=0.002)]. Adiponectin levels were not associated with CAC. Comparative analyses suggested that levels of leptin, IL-6 and sol-TNFR2 as well as HOMA-IR predicted CAC scores, but only leptin and HOMA-IR provided value beyond risk factors, the metabolic syndrome and CRP.
PMCID: PMC2853595  PMID: 18617073
Adiponectin; Leptin; Coronary Artery Calcification; Atherosclerosis; Inflammation
20.  Multi-Class Alignment of LC-MS Data Using Probabilistic-Based Mixture Regression Models 
In this paper, a framework of probabilistic-based mixture regression models (PMRM) is presented for multi-class alignment of liquid chromatography-mass spectrometry (LC-MS) data. The proposed framework performs the alignment in both time and measurement spaces of the LC-MS spectra. The expectation maximization (EM) algorithm is used to estimate the joint parameters of spline-based mixture regression models and prior transformation densities. The latter are incorporated to account for variability in time and measurement spaces of the data. As a proof of concept, the proposed method is applied to align single-class replicate LC-MS spectra generated from proteins of lysed E. coli cells. Its performance is compared with the dynamic time warping (DTW) and continuous profile model (CPM) approaches.
PMCID: PMC2714738  PMID: 19163612
21.  Singular value decomposition-based regression identifies activation of endogenous signaling pathways in vivo 
Genome Biology  2008;9(12):R180.
Singular value decomposition regression can detect the activation of endogenous signaling pathways, allowing the identification of pathway cross-talk.
The ability to detect activation of signaling pathways based solely on gene expression data represents an important goal in biological research. We tested the sensitivity of singular value decomposition-based regression by focusing on functional interactions between the Ras and transforming growth factor beta signaling pathways. Our findings demonstrate that this approach is sufficiently sensitive to detect the secondary activation of endogenous signaling pathways as it occurs through crosstalk following ectopic activation of a primary pathway.
PMCID: PMC2646284  PMID: 19094238
22.  Modeling genetic inheritance of copy number variations 
Nucleic Acids Research  2008;36(21):e138.
Copy number variations (CNVs) are being used as genetic markers or functional candidates in gene-mapping studies. However, unlike single nucleotide polymorphism or microsatellite genotyping techniques, most CNV detection methods are limited to detecting total copy numbers, rather than copy number in each of the two homologous chromosomes. To address this issue, we developed a statistical framework for intensity-based CNV detection platforms using family data. Our algorithm identifies CNVs for a family simultaneously, thus avoiding the generation of calls with Mendelian inconsistency while maintaining the ability to detect de novo CNVs. Applications to simulated data and real data indicate that our method significantly improves both call rates and accuracy of boundary inference, compared to existing approaches. We further illustrate the use of Mendelian inheritance to infer SNP allele compositions in each of the two homologous chromosomes in CNV regions using real data. Finally, we applied our method to a set of families genotyped using both the Illumina HumanHap550 and Affymetrix genome-wide 5.0 arrays to demonstrate its performance on both inherited and de novo CNVs. In conclusion, our method produces accurate CNV calls, gives probabilistic estimates of CNV transmission and builds a solid foundation for the development of linkage and association tests utilizing CNVs.
PMCID: PMC2588508  PMID: 18832372
23.  Identifying Biomarkers from Mass Spectrometry Data with Ordinal Outcome 
Cancer Informatics  2007;3:19-28.
In recent years, there has been an increased interest in using protein mass spectroscopy to identify molecular markers that discriminate diseased from healthy individuals. Existing methods are tailored towards classifying observations into nominal categories. Sometimes, however, the outcome of interest may be measured on an ordered scale. Ignoring this natural ordering results in some loss of information. In this paper, we propose a Bayesian model for the analysis of mass spectrometry data with ordered outcome. The method provides a unified approach for identifying relevant markers and predicting class membership. This is accomplished by building a stochastic search variable selection method within an ordinal outcome model. We apply the methodology to mass spectrometry data on ovarian cancer cases and healthy individuals. We also utilize wavelet-based techniques to remove noise from the mass spectra prior to analysis. We identify protein markers associated with being healthy, having low grade ovarian cancer, or being a high grade case. For comparison, we repeated the analysis using conventional classification procedures and found improved predictive accuracy with our method.
PMCID: PMC2675849  PMID: 19455232
Markov chain Monte Carlo; mass spectrometry; ordinal outcome; variable selection
24.  Unraveling gene-gene interactions regulated by ligands of the aryl hydrocarbon receptor. 
Environmental Health Perspectives  2004;112(4):403-412.
The co-expression of genes coupled to additive probabilistic relationships was used to identify gene sets predictive of the complex biological interactions regulated by ligands of the aryl hydrocarbon receptor (Ahr). To maximize the number of possible gene-gene combinations, data sets from murine embryonic kidney, fetal heart, and vascular smooth muscle cells challenged in vitro with ligands of the Ahr were used to create predictor/training data sets. Biologically relevant gene predictor sets were calculated for Ahr, cytochrome P450 1B1, insulin-like growth factor-binding protein-5, lysyl oxidase, and osteopontin. Transcript levels were categorized into ternary expressions and target genes selected from the data set and tested for all possible combinations using three gene sets as predictors of transitional level. The goodness of prediction for each set was quantified using a multivariate nonlinear coefficient of determination. Evidence is presented that predictor gene combinations can be effectively used to resolve gene-gene interactions regulated by Ahr ligands. Key words: aryl hydrocarbon receptor, bioinformatics, gene networks, genomics. Environ Health Perspect 112:403-412 (2004). [Online 14 January 2004]
PMCID: PMC1241891  PMID: 15033587
25.  Pharmacogenomic characterization of gemcitabine response – a framework for data integration to enable personalized medicine 
Pharmacogenetics and Genomics  2013;24(2):81-93.
Supplemental Digital Content is available in the text.
Response to the oncology drug gemcitabine may be variable in part due to genetic differences in the enzymes and transporters responsible for its metabolism and disposition. The aim of our in-silico study was to identify gene variants significantly associated with gemcitabine response that may help to personalize treatment in the clinic.
We analyzed two independent data sets: (a) genotype data from NCI-60 cell lines using the Affymetrix DMET 1.0 platform combined with gemcitabine cytotoxicity data in those cell lines, and (b) genome-wide association studies (GWAS) data from 351 pancreatic cancer patients treated on an NCI-sponsored phase III clinical trial. We also performed a subset analysis on the GWAS data set for 135 patients who were given gemcitabine+placebo. Statistical and systems biology analyses were performed on each individual data set to identify biomarkers significantly associated with gemcitabine response.
Genetic variants in the ABC transporters (ABCC1, ABCC4) and the CYP4 family members CYP4F8 and CYP4F12, CHST3, and PPARD were found to be significant in both the NCI-60 and GWAS data sets. We report significant association between drug response and variants within members of the chondroitin sulfotransferase family (CHST) whose role in gemcitabine response is yet to be delineated.
Biomarkers identified in this integrative analysis may contribute insights into gemcitabine response variability. As genotype data become more readily available, similar studies can be conducted to gain insights into drug response mechanisms and to facilitate clinical trial design and regulatory reviews.
PMCID: PMC3888473  PMID: 24401833
DMET; gemcitabine; NCI-60; pancreatic cancer; probabilistic networks
