Results 1-7 (7)
 

1.  Identification of Molecular Pathway Aberrations in Uterine Serous Carcinoma by Genome-wide Analyses 
Background
Uterine cancer is the fourth most common malignancy in women, and uterine serous carcinoma is the most aggressive subtype. However, the molecular pathogenesis of uterine serous carcinoma is largely unknown. We analyzed the genomes of uterine serous carcinoma samples to better understand the molecular genetic characteristics of this cancer.
Methods
Whole-exome sequencing was performed on 10 uterine serous carcinomas and the matched normal blood or tissue samples. Somatically acquired sequence mutations were further verified by Sanger sequencing. The most frequent molecular genetic changes were further validated by Sanger sequencing in 66 additional uterine serous carcinomas and in nine serous endometrial intraepithelial carcinomas (the preinvasive precursor of uterine serous carcinoma) that were isolated by laser capture microdissection. In addition, gene copy number was characterized by single-nucleotide polymorphism (SNP) arrays in 23 uterine serous carcinomas, including 10 that were subjected to whole-exome sequencing.
Results
We found frequent somatic mutations in TP53 (81.6%), PIK3CA (23.7%), FBXW7 (19.7%), and PPP2R1A (18.4%) among the 76 uterine serous carcinomas examined. All nine serous carcinomas that had an associated serous endometrial intraepithelial carcinoma had concordant PIK3CA, PPP2R1A, and TP53 mutation status between uterine serous carcinoma and the concurrent serous endometrial intraepithelial carcinoma component. DNA copy number analysis revealed frequent genomic amplification of the CCNE1 locus (which encodes cyclin E, a known substrate of FBXW7) and deletion of the FBXW7 locus. Among 23 uterine serous carcinomas that were subjected to SNP array analysis, seven tumors with FBXW7 mutations (four tumors with point mutations, three tumors with hemizygous deletions) did not have CCNE1 amplification, and 13 (57%) tumors had either a molecular genetic alteration in FBXW7 or CCNE1 amplification. Nearly half of these uterine serous carcinomas (48%) harbored PIK3CA mutation and/or PIK3CA amplification.
Conclusion
Molecular genetic aberrations involving the p53, cyclin E–FBXW7, and PI3K pathways represent major mechanisms in the development of uterine serous carcinoma.
doi:10.1093/jnci/djs345
PMCID: PMC3692380  PMID: 22923510
2.  An Overview of Population Genetic Data Simulation 
Abstract
Simulation studies in population genetics play an important role in helping to better understand the impact of various evolutionary and demographic scenarios on sequence variation and sequence patterns, and they also permit investigators to better assess and design analytical methods in the study of disease-associated genetic factors. To facilitate these studies, it is imperative to develop simulators with the capability to accurately generate complex genomic data under various genetic models. Currently, a number of efficient simulation software packages for large-scale genomic data are available, and new simulation programs with more sophisticated capabilities and features continue to emerge. In this article, we review the three basic simulation frameworks—coalescent, forward, and resampling—and some of the existing simulators that fall under these frameworks, comparing them with respect to their evolutionary and demographic scenarios, their computational complexity, and their specific applications. Additionally, we address some limitations in current simulation algorithms and discuss future challenges in the development of more powerful simulation tools.
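As a minimal illustration of the forward framework named above, the sketch below tracks one allele's frequency forward in time under neutral drift in a haploid Wright-Fisher population. It is not taken from any simulator reviewed in the article; the function name `wright_fisher` and its parameters are hypothetical, and selection, mutation, recombination, and demography are all left out.

```python
import random


def wright_fisher(pop_size, init_freq, generations, seed=None):
    """Forward-simulate one allele's frequency under neutral drift.

    Assumptions (illustrative only): haploid population of constant size,
    non-overlapping generations, no selection, mutation, or migration.
    """
    rng = random.Random(seed)
    freq = init_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each offspring independently inherits the allele with
        # probability equal to its frequency in the parent generation.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = count / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    path = wright_fisher(pop_size=100, init_freq=0.5, generations=50, seed=1)
    print(path[-1])
```

A coalescent simulator would instead work backward in time from a sample, which is typically faster but harder to extend to complex selection scenarios.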
doi:10.1089/cmb.2010.0188
PMCID: PMC3244809  PMID: 22149682
backward simulators; disease association study; forward simulators; genome simulation; resampling
3.  Comparative Analysis of Methods for Identifying Recurrent Copy Number Alterations in Cancer 
PLoS ONE  2012;7(12):e52516.
Recurrent copy number alterations (CNAs) play an important role in carcinogenesis. While a number of computational methods have been proposed for identifying such CNAs, their relative merits remain largely unknown in practice because very few efforts have focused on comparing them. To facilitate studies of recurrent CNA identification in cancer genomes, a comprehensive comparison of the performance and limitations of existing methods is needed. In this paper, six representative methods proposed in the past six years are compared. These include one-stage and two-stage approaches, working with raw intensity ratio data and discretized data, respectively. They are based on various techniques such as kernel regression, correlation matrix diagonal segmentation, semi-parametric permutation, and cyclic permutation schemes. We evaluate the methods under multiple simulation scenarios using several criteria: type I error rate, detection power, the Receiver Operating Characteristic (ROC) curve and the area under the curve (AUC), and computational complexity. We also characterize their performance on two real datasets, from lung adenocarcinoma and glioblastoma. This comparison study reveals general characteristics of the existing methods for identifying recurrent CNAs and provides new insights into their strengths and weaknesses. These findings should help accelerate the development of novel and improved methods.
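One of the evaluation criteria above, the AUC, can be computed without drawing the ROC curve: it equals the probability that a randomly chosen true-positive marker receives a higher score than a randomly chosen true-negative one. The sketch below assumes per-marker scores and binary ground-truth labels from a simulation; the function name `roc_auc` is hypothetical and the paper's own evaluation code is not shown in the abstract.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: one detection score per marker (higher = more likely recurrent).
    labels: 1 for markers inside a simulated recurrent CNA, 0 otherwise.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of markers; for genome-scale data a rank-based formulation would be the more practical choice.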
doi:10.1371/journal.pone.0052516
PMCID: PMC3527554  PMID: 23285074
4.  Genome-wide identification of significant aberrations in cancer genome 
BMC Genomics  2012;13:342.
Background
Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme.
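The general idea of testing recurrence against a permutation null can be sketched as follows. This is not the SAIC algorithm: it uses a simple cyclic shift of each sample's probe vector (which keeps runs of consecutive aberrant probes together) and a max-statistic null, without SAIC's CNA-unit scoring or its iterative SCA-exclusive scheme. The function name `recurrence_pvalues` and the binary-call input format are assumptions made for illustration.

```python
import random


def recurrence_pvalues(calls, n_perm=1000, seed=0):
    """Permutation p-value per probe for recurrence of an aberration.

    calls: list of samples, each a list of 0/1 aberration calls per probe.
    Null model (assumed): each sample is cyclically shifted by a random
    offset, which preserves within-sample segment structure but destroys
    alignment across samples.
    """
    rng = random.Random(seed)
    n_probes = len(calls[0])
    # Observed statistic: number of samples aberrant at each probe.
    observed = [sum(row[j] for row in calls) for j in range(n_probes)]
    null_max = []
    for _ in range(n_perm):
        shifted = []
        for row in calls:
            k = rng.randrange(n_probes)
            shifted.append(row[k:] + row[:k])
        # Genome-wide maximum controls for testing many probes at once.
        null_max.append(
            max(sum(r[j] for r in shifted) for j in range(n_probes))
        )
    # Add-one correction keeps p-values strictly positive.
    return [
        (1 + sum(m >= obs for m in null_max)) / (1 + n_perm)
        for obs in observed
    ]


if __name__ == "__main__":
    sample = [0] * 50
    sample[10:13] = [1, 1, 1]
    pvals = recurrence_pvalues([list(sample) for _ in range(8)], n_perm=200)
    print(pvals[11], pvals[0])
```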
Results
We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies.
Conclusions
Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. Open-source and platform-independent SAIC software is implemented using C++, together with R scripts for data formatting and Perl scripts for user interfacing, and it is easy to install and efficient to use. The source code and documentation are freely available at http://www.cbil.ece.vt.edu/software.htm.
doi:10.1186/1471-2164-13-342
PMCID: PMC3428679  PMID: 22839576
5.  TAGCNA: A Method to Identify Significant Consensus Events of Copy Number Alterations in Cancer 
PLoS ONE  2012;7(7):e41082.
Somatic copy number alteration (CNA) is a common phenomenon in cancer genomes. Distinguishing significant consensus events (SCEs) from random background CNAs in a set of subjects has proven to be a valuable tool for studying cancer. To identify SCEs with an acceptable type I error rate, better computational approaches should be developed based on reasonable statistics and null distributions. In this article, we propose a new approach named TAGCNA for identifying SCEs in somatic CNAs that may encompass cancer driver genes. TAGCNA employs a peel-off permutation scheme to generate a reasonable null distribution based on a prior step of selecting tag CNA markers from the genome being considered. We demonstrate the statistical power of TAGCNA on simulated ground truth data, and validate its applicability using two publicly available cancer datasets: lung and prostate adenocarcinoma. TAGCNA identifies SCEs that are known to be involved with proto-oncogenes (e.g., EGFR, CDK4) and tumor suppressor genes (e.g., CDKN2A, CDKN2B), and provides many additional SCEs with potential biological relevance in these data. TAGCNA can be used to analyze the significance of CNAs in various cancers. It is implemented in R and is freely available at http://tagcna.sourceforge.net/.
doi:10.1371/journal.pone.0041082
PMCID: PMC3399811  PMID: 22815924
6.  Comparative analysis of methods for detecting interacting loci 
BMC Genomics  2011;12:344.
Background
Interactions among genetic loci are believed to play an important role in disease risk. While many methods have been proposed for detecting such interactions, their relative performance remains largely unclear, mainly because different data sources, detection performance criteria, and experimental protocols were used in the papers introducing these methods and in subsequent studies. Moreover, there have been very few studies strictly focused on comparison of existing methods. Given the importance of detecting gene-gene and gene-environment interactions, a rigorous, comprehensive comparison of performance and limitations of available interaction detection methods is warranted.
Results
We report a comparison of eight representative methods, of which seven were specifically designed to detect interactions among single nucleotide polymorphisms (SNPs) and the eighth is a popular main-effect testing method used as a baseline for performance evaluation. The selected methods were multifactor dimensionality reduction (MDR), full interaction model (FIM), information gain (IG), Bayesian epistasis association mapping (BEAM), SNP harvester (SH), maximum entropy conditional probability modeling (MECPM), logistic regression with an interaction term (LRIT), and logistic regression (LR). They were compared on a large number of simulated data sets, each of which, consistent with complex disease models, embedded multiple sets of interacting SNPs under different interaction models. The assessment criteria included several relevant detection power measures, family-wise type I error rate, and computational complexity. There are several important results from this study. First, while some SNPs in interactions with strong effects are successfully detected, most of the methods miss many interacting SNPs at an acceptable rate of false positives. In this study, the best-performing method was MECPM. Second, the statistical significance assessment criteria, used by some of the methods to control the type I error rate, are quite conservative, thereby limiting their power and making it difficult to fairly compare them. Third, as expected, power varies for different models and as a function of penetrance, minor allele frequency, linkage disequilibrium and marginal effects. Fourth, the analytical relationships between power and these factors are derived, aiding in the interpretation of the study results. Fifth, for these methods the magnitude of the main effect influences the power of the tests. Sixth, most methods can detect some ground-truth SNPs but have modest power to detect the whole set of interacting SNPs.
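To make the distinction between interaction and main effects concrete, the sketch below computes one common information-theoretic interaction score for a SNP pair: the mutual information between the joint genotype and disease status, minus the mutual information each SNP carries alone. Whether this matches the exact IG formulation evaluated in the study is an assumption; the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(snp_a, snp_b, status):
    """Information about status gained only by considering both SNPs jointly.

    Positive values suggest a synergistic (epistatic) effect beyond the
    two main effects; values near zero suggest no interaction.
    """
    joint = list(zip(snp_a, snp_b))
    return (
        mutual_information(joint, status)
        - mutual_information(snp_a, status)
        - mutual_information(snp_b, status)
    )


if __name__ == "__main__":
    # Toy XOR model: neither SNP alone predicts status, the pair does.
    a = [0, 0, 1, 1]
    b = [0, 1, 0, 1]
    y = [0, 1, 1, 0]
    print(interaction_gain(a, b, y))
```

In the toy XOR case each SNP has zero marginal effect, which is exactly the situation where a main-effect baseline such as plain logistic regression would be expected to miss the signal.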
Conclusion
This comparison study provides new insights into the strengths and limitations of current methods for detecting interacting loci. This study, along with freely available simulation tools we provide, should help support development of improved methods. The simulation tools are available at: http://code.google.com/p/simulation-tool-bmc-ms9169818735220977/downloads/list.
doi:10.1186/1471-2164-12-344
PMCID: PMC3161015  PMID: 21729295
7.  Probability Theory-based SNP Association Study Method for Identifying Susceptibility Loci and Genetic Disease Models in Human Case-Control Data 
One of the most challenging tasks in studying common complex human diseases is to search for both strong and weak susceptibility single-nucleotide polymorphisms (SNPs) and to identify the forms of genetic disease models. A number of methods have been proposed for this purpose, but many of them have not been validated on a variety of genome datasets, so their performance in practice is unclear. In this paper, we present a novel SNP association study method based on probability theory, called ProbSNP. The method first detects SNPs by evaluating their joint probabilities in combination with disease status and selects those with the lowest joint probabilities as susceptibility SNPs; it then identifies some forms of genetic disease models by testing multiple-locus interactions among the selected SNPs. The joint probabilities of combined SNPs are estimated by establishing Gaussian probability density functions, whose parameters (i.e., mean and standard deviation) are evaluated based on allele and haplotype frequencies. Finally, we test and validate the method using various genome datasets. We find that ProbSNP is remarkably successful when applied to both simulated genome data and real genome-wide data.
doi:10.1109/TNB.2010.2070805
PMCID: PMC3029504  PMID: 20840904
Association study; SNPs; probability theory; Gaussian distribution; case-control
