Results 1-8 (8)
 

author:("He, qinchuan")
1.  A Web-Server of Cell Type Discrimination System 
The Scientific World Journal  2014;2014:459064.
Discriminating cell types is a routine need for stem cell biologists. However, no user-friendly system has been publicly available to date for discriminating the common cell types: embryonic stem cells (ESCs), induced pluripotent stem cells (iPSCs), and somatic cells (SCs). Here, we develop WCTDS, a web-server of cell type discrimination system, to discriminate the three cell types and their subtypes, such as fetal versus adult SCs. WCTDS is developed as a top-layer application of our recent publication on cell type discrimination, which employs DNA-methylation biomarkers and machine learning models to discriminate cell types. Implemented with Django, Python, R, and Linux shell programming, run under a Linux-Apache web server, and communicating through MySQL, WCTDS provides a friendly framework that efficiently receives user input, runs mathematical models to analyze the data, and presents the results to users. This framework is flexible and easy to extend to other applications. Therefore, WCTDS works as a user-friendly framework to discriminate cell types and subtypes, and it can also be extended to detect other cell types, such as cancer cells.
doi:10.1155/2014/459064
PMCID: PMC3919083  PMID: 24578634
2.  A Quantitative System for Discriminating Induced Pluripotent Stem Cells, Embryonic Stem Cells and Somatic Cells 
PLoS ONE  2013;8(2):e56095.
Induced pluripotent stem cells (iPSCs) derived from somatic cells (SCs) and embryonic stem cells (ESCs) provide promising resources for regenerative medicine and medical research, leading to a daily identification of new cell lines. However, an efficient system to discriminate the different types of cell lines is lacking. Here, we develop a quantitative system to discriminate the three cell types, iPSCs, ESCs, and SCs. The system consists of DNA-methylation biomarkers and mathematical models, including an artificial neural network and support vector machines. All biomarkers were unbiasedly selected by calculating an eigengene score derived from analysis of genome-wide DNA methylation. With 30 biomarkers, or even with as few as 3 top biomarkers, this system can discriminate SCs from pluripotent cells (PCs, including ESCs and iPSCs) with almost 100% accuracy. With approximately 100 biomarkers, the system can distinguish ESCs from iPSCs with an accuracy of 95%. This robust system performs precisely with raw data without normalization as well as with converted data in which the continuous methylation levels are accounted for. Strikingly, this system can even accurately predict new samples generated from different microarray platforms and from next-generation sequencing. The subtypes of cells, such as female and male iPSCs and fetal and adult SCs, can also be discriminated with this method. Thus, this novel quantitative system works as an accurate framework for discriminating the three cell types, iPSCs, ESCs, and SCs. This strategy also supports the notion that DNA methylation generally varies among the three cell types.
doi:10.1371/journal.pone.0056095
PMCID: PMC3572019  PMID: 23418520
3.  Common genetic variation in adiponectin, leptin, and leptin receptor and association with breast cancer subtypes 
Adipocytokines are produced by visceral fat, and levels may be associated with breast cancer risk. We investigated whether single nucleotide polymorphisms (SNPs) in adipocytokine genes adiponectin (ADIPOQ), leptin (LEP), and the leptin receptor (LEPR) were associated with basal-like or luminal A breast cancer subtypes. 104 candidate and tag SNPs were genotyped in 1776 of 2022 controls and 1972 (200 basal-like, 679 luminal A) of 2311 cases from the Carolina Breast Cancer Study (CBCS), a population-based case–control study of whites and African Americans. Breast cancer molecular subtypes were determined by immunohistochemistry. Genotype odds ratios (ORs) and 95% confidence intervals (CIs) were estimated using unconditional logistic regression. Haplotype ORs and 95% CIs were estimated using Hapstat. Interactions with waist-hip ratio were evaluated using a multiplicative interaction term. Ancestry was estimated from 144 ancestry informative markers (AIMs), and included in models to control for population stratification. Candidate SNPs LEPR K109R (rs1137100) and LEPR Q223R (rs1137101) were positively associated with luminal A breast cancer, whereas ADIPOQ +45 T/G (rs2241766), ADIPOQ +276 G/T (rs1501299), and LEPR K656N (rs8129183) were not associated with either subtype. Few patterns were observed among tag SNPs, with the exception of 3 LEPR SNPs (rs17412175, rs9436746, and rs9436748) that were in moderate LD and inversely associated with basal-like breast cancer. However, no SNP associations were statistically significant after adjustment for multiple comparisons. Haplotypes in LEP and LEPR were associated with both basal-like and luminal A subtypes. There was no evidence of interaction with waist-hip ratio. Data suggest associations between LEPR candidate SNPs and luminal A breast cancer in the CBCS and LEPR intron 2 tag SNPs and basal-like breast cancer. 
Replication in additional studies where breast cancer subtypes have been defined is necessary to confirm these potential associations.
doi:10.1007/s10549-011-1517-z
PMCID: PMC3355661  PMID: 21516303
Adiponectin; Leptin; Leptin receptor; Breast cancer; Subtypes; Single nucleotide polymorphism
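The genotype odds ratios and 95% confidence intervals mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest case, a single binary exposure (for example, carrier versus non-carrier of an allele) with no covariates, where the odds ratio from unconditional logistic regression equals the cross-product ratio of the 2x2 table and a Wald interval can be computed by hand. The study itself adjusted for ancestry and other terms, so this is not the authors' model; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    All four counts must be positive. z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
or_est, ci_low, ci_high = odds_ratio_ci(30, 70, 20, 80)
print(f"OR = {or_est:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that contains 1, as it does for these made-up counts, would be read as no clear evidence of association at that confidence level.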
4.  Fine-Mapping and Initial Characterization of QT Interval Loci in African Americans 
PLoS Genetics  2012;8(8):e1002870.
The QT interval (QT) is heritable and its prolongation is a risk factor for ventricular tachyarrhythmias and sudden death. Most genetic studies of QT have examined European ancestral populations; however, the increased genetic diversity in African Americans provides opportunities to narrow association signals and identify population-specific variants. We therefore evaluated 6,670 SNPs spanning eleven previously identified QT loci in 8,644 African American participants from two Population Architecture using Genomics and Epidemiology (PAGE) studies: the Atherosclerosis Risk in Communities study and Women's Health Initiative Clinical Trial. Of the fifteen known independent QT variants at the eleven previously identified loci, six were significantly associated with QT in African American populations (P≤1.20×10^-4): ATP1B1, PLN1, KCNQ1, NDRG4, and two NOS1AP independent signals. We also identified three population-specific signals significantly associated with QT in African Americans (P≤1.37×10^-5): one at NOS1AP and two at ATP1B1. Linkage disequilibrium (LD) patterns in African Americans assisted in narrowing the region likely to contain the functional variants for several loci. For example, African American LD patterns showed that 0 SNPs were in LD with NOS1AP signal rs12143842, compared with European LD patterns that indicated 87 SNPs, which spanned 114.2 kb, were in LD with rs12143842. Finally, bioinformatic-based characterization of the nine African American signals pointed to functional candidates located exclusively within non-coding regions, including predicted binding sites for transcription factors such as TBX5, which has been implicated in cardiac structure and conductance. In this detailed evaluation of QT loci, we identified several African American SNPs that better define the association with QT and successfully narrowed intervals surrounding established loci. 
These results demonstrate that the same loci influence variation in QT across multiple populations, that novel signals exist in African Americans, and that the SNPs identified as strong candidates for functional evaluation implicate gene regulatory dysfunction in QT prolongation.
Author Summary
The QT interval (QT) provides a measure of a ventricular action potential, and its prolongation is associated with sudden death and ventricular arrhythmias. Genome-wide association studies performed in European populations have identified common genetic variants that influence QT. However, it is unclear whether these variants are relevant in other populations, including African Americans. The increased genetic diversity in African Americans also provides opportunities to narrow association signals and identify candidates for functional evaluation. We therefore used data from 8,644 African Americans to further characterize previously identified QT loci. Of the fifteen known independent QT variants at the eleven previously identified QT loci, six were associated with QT in African Americans. We also identified three variants that were independent from previously reported signals and narrowed intervals flanking association signals using patterns of linkage disequilibrium. Finally, bioinformatic-based characterization pointed to candidates located outside protein coding regions. Our results underscore the utility of genetic studies in African ancestral populations to identify novel variants and narrow intervals surrounding established loci. These results suggest that known QT loci are important in African Americans and that further characterization of these loci in other populations may provide additional insights into the genetic and molecular mechanisms underlying QT.
doi:10.1371/journal.pgen.1002870
PMCID: PMC3415454  PMID: 22912591
5.  A variable selection method for genome-wide association studies 
Bioinformatics  2010;27(1):1-8.
Motivation: Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).
Results: We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false positive findings. Simulation studies demonstrate that the GWASelect performs well under a wide spectrum of linkage disequilibrium patterns and can be substantially more powerful than existing methods in capturing causal variants while having a lower FDR. In addition, the regression models based on the GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of the GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.
Availability: The software implementing GWASelect is available at http://www.bios.unc.edu/~lin.
Access to WTCCC data: http://www.wtccc.org.uk/
Contact: lin@bios.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics Online.
doi:10.1093/bioinformatics/btq600
PMCID: PMC3025714  PMID: 21036813
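The two ideas named in the Results paragraph above, searching over SNPs conditional on those already selected and using resampling to reduce false positives, can be sketched as follows. This is not GWASelect itself, whose details the abstract does not give. The stand-in here is a greedy forward selection that scores each remaining SNP against the residual left by earlier picks, repeated over random half-samples, keeping only SNPs chosen in at least half of the rounds. All function names, thresholds, and the simulated data are assumptions made for illustration.

```python
import math
import random


def corr(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def forward_select(snps, y, min_abs_corr=0.3, max_steps=5):
    """Greedy selection: each step scores SNPs against the current residual."""
    n = len(y)
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]
    chosen = []
    for _ in range(max_steps):
        candidates = [j for j in range(len(snps)) if j not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda j: abs(corr(snps[j], resid)))
        if abs(corr(snps[best], resid)) < min_abs_corr:
            break
        chosen.append(best)
        # Remove the chosen SNP's linear effect so later picks are conditional on it.
        x = snps[best]
        mx = sum(x) / n
        sxx = sum((a - mx) ** 2 for a in x)
        slope = sum((a - mx) * r for a, r in zip(x, resid)) / sxx
        resid = [r - slope * (a - mx) for a, r in zip(x, resid)]
    return chosen


def stable_select(snps, y, n_rounds=20, keep_frac=0.5, seed=0):
    """Run forward_select on random half-samples; keep SNPs picked often enough."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(snps)
    for _ in range(n_rounds):
        idx = rng.sample(range(n), n // 2)
        sub_snps = [[col[i] for i in idx] for col in snps]
        sub_y = [y[i] for i in idx]
        for j in forward_select(sub_snps, sub_y):
            counts[j] += 1
    return [j for j, c in enumerate(counts) if c / n_rounds >= keep_frac]


# Simulated data (hypothetical): 6 SNPs coded 0/1/2, of which SNPs 0 and 3 are causal.
data_rng = random.Random(1)
n_samples, n_snps = 200, 6
snps = [[data_rng.choice([0, 1, 2]) for _ in range(n_samples)] for _ in range(n_snps)]
y = [
    2.0 * snps[0][i] - 1.5 * snps[3][i] + data_rng.gauss(0, 0.5)
    for i in range(n_samples)
]
selected = stable_select(snps, y)
print("SNPs kept after resampling:", selected)
```

The voting step is the design choice of interest: a SNP that enters the model in only a few subsamples is treated as a likely false positive and dropped.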
6.  A variable selection method for genome-wide association studies 
Bioinformatics  2011;27(1):1-8.
Motivation
Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).
Results
We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false-positive findings. Simulation studies demonstrate that the GWASelect performs well under a wide spectrum of linkage disequilibrium (LD) patterns and can be substantially more powerful than existing methods in capturing causal variants while having a lower FDR. In addition, the regression models based on the GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of the GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.
doi:10.1093/bioinformatics/btq600
PMCID: PMC3025714  PMID: 21036813
7.  A Phenomics-Based Strategy Identifies Loci on APOC1, BRAP, and PLCG1 Associated with Metabolic Syndrome Phenotype Domains 
PLoS Genetics  2011;7(10):e1002322.
Despite evidence of the clustering of metabolic syndrome components, current approaches for identifying unifying genetic mechanisms typically evaluate clinical categories that do not provide adequate etiological information. Here, we used data from 19,486 European American and 6,287 African American Candidate Gene Association Resource Consortium participants to identify loci associated with the clustering of metabolic phenotypes. Six phenotype domains (atherogenic dyslipidemia, vascular dysfunction, vascular inflammation, pro-thrombotic state, central obesity, and elevated plasma glucose) encompassing 19 quantitative traits were examined. Principal components analysis was used to reduce the dimension of each domain such that >55% of the trait variance was represented within each domain. We then applied a statistically efficient and computationally feasible multivariate approach that related eight principal components from the six domains to 250,000 imputed SNPs using an additive genetic model and including demographic covariates. In European Americans, we identified 606 genome-wide significant SNPs representing 19 loci. Many of these loci were associated with only one trait domain, were consistent with results in African Americans, and overlapped with published findings, for instance central obesity and FTO. However, our approach, which is applicable to any set of interval scale traits that is heritable and exhibits evidence of phenotypic clustering, identified three new loci in or near APOC1, BRAP, and PLCG1, which were associated with multiple phenotype domains. These pleiotropic loci may help characterize metabolic dysregulation and identify targets for intervention.
Author Summary
The metabolic syndrome represents a clustering of metabolic phenotypes (e.g. elevated blood pressure, cholesterol levels, and plasma glucose, as well as abdominal obesity) and is associated with an increased risk of atherosclerosis and type 2 diabetes. Although multiple genes influencing the specific metabolic syndrome components have been reported, few studies have evaluated the genetic underpinnings of the syndrome as a whole. Here, we describe an approach to evaluate multiple clustered traits, which allows us to test whether common genetic variants influence the co-occurrence of one or more metabolic phenotypes. By examining approximately 20,000 European American and 6,200 African American participants from five studies, we show that three regions on chromosomes 12, 19, and 20 are associated with multiple metabolic phenotypes. These genetic variants are highly intriguing candidates that may increase our understanding of the biologic basis of the clustering of metabolic phenotypes and help identify targets for early intervention.
doi:10.1371/journal.pgen.1002322
PMCID: PMC3192835  PMID: 22022282
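The dimension-reduction rule in the abstract above (keep enough principal components that more than 55% of a domain's trait variance is represented) can be sketched as a small helper. It assumes the eigenvalues of a domain's trait correlation matrix have already been computed elsewhere; the function name and the eigenvalues shown are hypothetical, not values from the study.

```python
def components_needed(eigenvalues, min_fraction=0.55):
    """Smallest number of leading components whose share of variance exceeds min_fraction."""
    total = sum(eigenvalues)
    running = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total > min_fraction:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues for two phenotype domains.
domain_a = [2.1, 1.0, 0.6, 0.3]  # first component explains 52.5%, so a second is needed
domain_b = [3.0, 0.6, 0.4]       # first component alone explains 75%
print(components_needed(domain_a), components_needed(domain_b))
```

Under a rule like this, some domains retain one component and others more than one, so the number of components across domains can exceed the number of domains.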
8.  Genomewide Association for Major Depressive Disorder: A possible role for the presynaptic protein Piccolo 
Molecular Psychiatry  2008;14(4):359-375.
Major depressive disorder (MDD) is a common complex trait with enormous public health significance. As part of the Genetic Association Information Network (GAIN) initiative of the US Foundation for the National Institutes of Health, we conducted a genomewide association study of 435,291 SNPs genotyped in 1,738 MDD cases and 1,802 controls selected to be at low liability for MDD. Eleven of the top 200 signals localized to a 167 kb region overlapping the gene piccolo (PCLO, whose protein product localizes to the cytomatrix of the presynaptic active zone and plays an important role in monoaminergic neurotransmission in the brain) with p-values of 7.7×10^-7 for rs2715148 and 1.2×10^-6 for rs2522833. We undertook replication of SNPs in this region in 5 independent samples (6,079 independent MDD cases and 5,893 controls) but no SNP exceeded the replication significance threshold when all replication samples were analyzed together. However, there was heterogeneity in the replication samples, and secondary analysis of the original sample with the sample of greatest similarity yielded p=6.4×10^-8 for the non-synonymous SNP rs2522833 that gives rise to a serine to alanine substitution near a C2 calcium-binding-domain of the PCLO protein. With the integrated replication effort, we present a specific hypothesis for further studies.
doi:10.1038/mp.2008.125
PMCID: PMC2717726  PMID: 19065144
major depressive disorder; genome-wide association; Netherlands Study of Depression and Anxiety; Netherlands Twin Registry
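For readers weighing the p-values quoted in the abstract above, a short arithmetic sketch may help. It assumes a Bonferroni correction at alpha = 0.05 over the 435,291 genotyped SNPs. The abstract does not state which genome-wide threshold the authors used, and the secondary analysis was run on a different combined sample, so the comparison below is illustrative only.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# SNP count and p-values are taken from the abstract; the threshold is an assumption.
threshold = bonferroni_threshold(435_291)
reported = {
    "rs2715148 (primary)": 7.7e-7,
    "rs2522833 (primary)": 1.2e-6,
    "rs2522833 (secondary analysis)": 6.4e-8,
}
passes = {label: p < threshold for label, p in reported.items()}

print(f"assumed threshold: {threshold:.2e}")
for label, ok in passes.items():
    print(label, "below threshold" if ok else "above threshold")
```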
