1.  Genomic Loss of Tumor Suppressor miRNA-204 Promotes Cancer Cell Migration and Invasion by Activating AKT/mTOR/Rac1 Signaling and Actin Reorganization 
PLoS ONE  2012;7(12):e52397.
Increasing evidence suggests that chromosomal regions containing microRNAs are functionally important in cancers. Here, we show that genomic loci encoding miR-204 are frequently lost in multiple cancers, including ovarian cancers, pediatric renal tumors, and breast cancers. MiR-204 shows drastically reduced expression in several cancers and acts as a potent tumor suppressor, inhibiting tumor metastasis in vivo when systemically delivered. We demonstrated that miR-204 exerts its function by targeting genes involved in tumorigenesis including brain-derived neurotrophic factor (BDNF), a neurotrophin family member which is known to promote tumor angiogenesis and invasiveness. Analysis of primary tumors shows that increased expression of BDNF or its receptor tropomyosin-related kinase B (TrkB) parallel a markedly reduced expression of miR-204. Our results reveal that loss of miR-204 results in BDNF overexpression and subsequent activation of the small GTPase Rac1 and actin reorganization through the AKT/mTOR signaling pathway leading to cancer cell migration and invasion. These results suggest that microdeletion of genomic loci containing miR-204 is directly linked with the deregulation of key oncogenic pathways that provide crucial stimulus for tumor growth and metastasis. Our findings provide a strong rationale for manipulating miR-204 levels therapeutically to suppress tumor metastasis.
doi:10.1371/journal.pone.0052397
PMCID: PMC3528651  PMID: 23285024
2.  A Bayesian decision fusion approach for microRNA target prediction 
BMC Genomics  2012;13(Suppl 8):S13.
MicroRNAs (miRNAs) are non-coding RNAs of 19-25 nucleotides known to have important post-transcriptional regulatory functions. Computational target prediction is vital to effective experimental testing. However, since existing algorithms rely on different features and classifiers, their results agree poorly. To benefit from the advantages of different algorithms, we proposed an algorithm called BCmicrO that combines the predictions of different algorithms with a Bayesian network. BCmicrO was evaluated using the training data and the proteomic data. The results show that BCmicrO improves on both the sensitivity and the specificity of each individual algorithm. All the related materials including genome-wide prediction of human targets and a web-based tool are available at http://compgenomics.utsa.edu/gene/gene_1.php.
doi:10.1186/1471-2164-13-S8-S13
PMCID: PMC3535698  PMID: 23282032
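The idea of fusing several predictors' outputs, as described for BCmicrO above, can be illustrated with a much simpler stand-in. The sketch below is not the BCmicrO Bayesian network: it assumes the individual predictors are conditionally independent (a naive Bayes simplification), and the function name, the sensitivity and specificity values, and the prior are all hypothetical.

```python
from math import prod


def fuse_predictions(votes, sensitivity, specificity, prior=0.5):
    """Posterior probability that a candidate is a true target.

    votes[i] is 1 if predictor i calls the candidate a target, else 0.
    sensitivity[i] and specificity[i] describe predictor i (assumed known).
    Predictors are assumed conditionally independent (naive Bayes).
    """
    # Likelihood of the observed votes if the candidate is a true target.
    like_pos = prod(se if v else 1.0 - se for v, se in zip(votes, sensitivity))
    # Likelihood of the observed votes if the candidate is not a target.
    like_neg = prod(1.0 - sp if v else sp for v, sp in zip(votes, specificity))
    numerator = prior * like_pos
    return numerator / (numerator + (1.0 - prior) * like_neg)


# Hypothetical operating points for three predictors.
SENS = [0.8, 0.7, 0.6]
SPEC = [0.9, 0.8, 0.7]
posterior = fuse_predictions([1, 1, 0], SENS, SPEC)
```

Two agreeing predictors outweigh one dissenting vote here because the dissenting predictor is the least reliable of the three; a full Bayesian network would additionally model dependencies between predictors.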
3.  Reducing confounding and suppression effects in TCGA data: an integrated analysis of chemotherapy response in ovarian cancer 
BMC Genomics  2012;13(Suppl 6):S13.
Background
Despite initial response to adjuvant chemotherapy, ovarian cancer patients treated with the combination of paclitaxel and carboplatin frequently suffer from recurrence after a few cycles of treatment, and the underlying mechanisms causing the chemoresistance remain unclear. Recently, The Cancer Genome Atlas (TCGA) research network concluded an ovarian cancer study and released the dataset to the public. The TCGA dataset possesses a large sample size, comprehensive molecular profiles, and clinical outcome information; however, because of the unknown molecular subtypes in ovarian cancer and the great diversity of adjuvant treatments TCGA patients went through, studying chemotherapeutic response using the TCGA data is difficult. Additionally, factors such as sample batches, patient ages, and tumor stages further confound or suppress the identification of relevant genes, and thus the biological functions and disease mechanisms.
Results
To address these issues, herein we propose an analysis procedure designed to reduce suppression effect by focusing on a specific chemotherapeutic treatment, and to remove confounding effects such as batch effect, patient's age, and tumor stages. The proposed procedure starts with a batch effect adjustment, followed by a rigorous sample selection process. Then, the gene expression, copy number, and methylation profiles from the TCGA ovarian cancer dataset are analyzed using a semi-supervised clustering method combined with a novel scoring function. As a result, two molecular classifications, one with poor copy number profiles and one with poor methylation profiles, enriched with unfavorable scores are identified. Compared with the samples enriched with favorable scores, these two classifications exhibit poor progression-free survival (PFS) and might be associated with poor chemotherapy response specifically to the combination of paclitaxel and carboplatin. Significant genes and biological processes are detected subsequently using classical statistical approaches and enrichment analysis.
Conclusions
The proposed procedure for the reduction of confounding and suppression effects and the semi-supervised clustering method are essential steps to identify genes associated with the chemotherapeutic response.
doi:10.1186/1471-2164-13-S6-S13
PMCID: PMC3481440  PMID: 23134756
4.  Pathway Distiller - multisource biological pathway consolidation 
BMC Genomics  2012;13(Suppl 6):S18.
Background
One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets.
Methods
After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment.
Results
We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods.
Conclusions
By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have given users the ability to extract functional explanations of their genome-wide experiments.
doi:10.1186/1471-2164-13-S6-S18
PMCID: PMC3481446  PMID: 23134636
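The notion of merging highly related or redundant pathways into pathway concepts, described for Pathway Distiller above, can be sketched with a single similarity measure. This is only an illustration under stated assumptions: it uses Jaccard overlap of gene sets with single-linkage grouping, which is not claimed to be any of the three published consolidation methods, and the pathway names, gene sets, and threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(pathways, threshold=0.5):
    """Group pathway names whose gene sets overlap at or above threshold.

    Grouping is single linkage: two pathways end up in the same concept if a
    chain of sufficiently similar pathways connects them.
    """
    names = list(pathways)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, p in enumerate(names):
        for q in names[i + 1:]:
            if jaccard(pathways[p], pathways[q]) >= threshold:
                parent[find(q)] = find(p)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pathways drawn from different source databases.
PATHWAYS = {
    "A": {"TP53", "MDM2", "CDKN1A"},
    "B": {"TP53", "MDM2", "ATM"},
    "C": {"AKT1", "MTOR"},
}
concepts = consolidate(PATHWAYS)
```

Pathways A and B share two of four distinct genes (Jaccard 0.5) and are merged into one concept, while C stays separate.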
6.  Androgen-Responsive MicroRNAs in Mouse Sertoli Cells 
PLoS ONE  2012;7(7):e41146.
Although decades of research have established that androgen is essential for spermatogenesis, androgen's mechanism of action remains elusive. This is in part because only a few androgen-responsive genes have been definitively identified in the testis. Here, we propose that microRNAs – small, non-coding RNAs – are one class of androgen-regulated trans-acting factors in the testis. Specifically, by using androgen suppression and androgen replacement in mice, we show that androgen regulates the expression of several microRNAs in Sertoli cells. Our results reveal that several of these microRNAs are preferentially expressed in the testis and regulate genes that are highly expressed in Sertoli cells. Because androgen receptor-mediated signaling is essential for the pre- and post-meiotic germ cell development, we propose that androgen controls these events by regulating Sertoli/germ cell-specific gene expression in a microRNA-dependent manner.
doi:10.1371/journal.pone.0041146
PMCID: PMC3401116  PMID: 22911753
7.  Evidence for an Unanticipated Relationship Between Undifferentiated Pleomorphic Sarcoma and Embryonal Rhabdomyosarcoma 
Cancer cell  2011;19(2):177-191.
SUMMARY
Embryonal rhabdomyosarcoma (eRMS) shows the most myodifferentiation amongst sarcomas, yet the precise cell of origin remains undefined. Using Ptch1, p53 and/or Rb1 conditional mouse models and controlling prenatal or postnatal myogenic cell of origin, we demonstrate that eRMS and undifferentiated pleomorphic sarcoma (UPS) lie in a continuum, with satellite cells predisposed to giving rise to UPS. Conversely, p53 loss in maturing myoblasts gives rise to eRMS, which have the highest myodifferentiation potential. Irrespective of origin, Rb1 loss modifies tumor phenotype to mimic UPS. In human sarcomas that lack pathognomonic chromosomal translocations, p53 loss of function is prevalent whereas Shh or Rb1 alterations likely act primarily as modifiers. Thus, sarcoma phenotype is strongly influenced by cell of origin and mutational profile.
doi:10.1016/j.ccr.2010.12.023
PMCID: PMC3040414  PMID: 21316601
8.  Bayesian non-negative factor analysis for reconstructing transcription factor mediated regulatory networks 
Proteome Science  2011;9(Suppl 1):S9.
Background
Transcriptional regulation by transcription factors (TFs) controls the timing and abundance of mRNA transcription. Due to the limitations of current proteomics technologies, large-scale measurement of the protein-level activities of TFs is usually infeasible, making computational reconstruction of transcriptional regulatory networks a difficult task.
Results
We propose here a novel Bayesian non-negative factor model for TF-mediated regulatory networks. In particular, the non-negative TF activities and sample clustering effect are modeled as the factors from a Dirichlet process mixture of rectified Gaussian distributions, and the sparse regulatory coefficients are modeled as the loadings from a sparse distribution that constrains its sparsity using knowledge from a database; meanwhile, a Gibbs sampling solution was developed to infer the underlying network structure and the unknown TF activities simultaneously. The developed approach has been applied to a simulated system and to breast cancer gene expression data. The results show that the proposed method was able to systematically uncover the TF-mediated transcriptional regulatory network structure, the regulatory coefficients, the TF protein-level activities, and the sample clustering effect. The regulation target predictions are highly consistent with the prior knowledge, and the sample clustering results show superior performance over a previous molecular-based clustering method.
Conclusions
The results demonstrated the validity and effectiveness of the proposed approach in reconstructing transcriptional networks mediated by TFs through simulated systems and real data.
doi:10.1186/1477-5956-9-S1-S9
PMCID: PMC3289087  PMID: 22166063
9.  A model-based circular binary segmentation algorithm for the analysis of array CGH data 
BMC Research Notes  2011;4:394.
Background
Circular Binary Segmentation (CBS) is a permutation-based algorithm for array Comparative Genomic Hybridization (aCGH) data analysis. CBS accurately segments data by detecting change-points using a maximal-t test; but extensive computational burden is involved for evaluating the significance of change-points using permutations. A recent implementation utilizing a hybrid method and early stopping rules (hybrid CBS) to improve the performance in speed was subsequently proposed. However, a time analysis revealed that a major portion of computation time of the hybrid CBS was still spent on permutation. In addition, what the hybrid method provides is an approximation of the significance upper bound or lower bound, not an approximation of the significance of change-points itself.
Results
We developed a novel model-based algorithm, extreme-value based CBS (eCBS), which limits permutations and provides robust results without loss of accuracy. Thousands of aCGH data under null hypothesis were simulated in advance based on a variety of non-normal assumptions, and the corresponding maximal-t distribution was modeled by the Generalized Extreme Value (GEV) distribution. The modeling results, which associate characteristics of aCGH data to the GEV parameters, constitute lookup tables (eXtreme model). Using the eXtreme model, the significance of change-points could be evaluated in a constant time complexity through a table lookup process.
Conclusions
A novel algorithm, eCBS, was developed in this study. The current implementation of eCBS consistently outperforms the hybrid CBS by 4× to 20× in computation time without loss of accuracy. Source code, supplementary materials, supplementary figures, and supplementary tables can be found at http://ntumaps.cgm.ntu.edu.tw/eCBSsupplementary.
doi:10.1186/1756-0500-4-394
PMCID: PMC3224564  PMID: 21985277
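The table-lookup step described for eCBS above can be sketched as follows, assuming the maximal-t statistic under the null follows a Generalized Extreme Value (GEV) distribution with location mu, scale sigma, and shape xi. The lookup table, its keys (number of probes), and every parameter value below are hypothetical placeholders, not the published eXtreme model.

```python
import math


def gev_sf(x, mu, sigma, xi):
    """Upper-tail probability P(X > x) of a GEV(mu, sigma, xi) variable."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family.
        return -math.expm1(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below it when xi > 0, above it when xi < 0.
        return 1.0 if xi > 0 else 0.0
    return -math.expm1(-(t ** (-1.0 / xi)))


# Hypothetical lookup table: number of probes -> (mu, sigma, xi).
GEV_TABLE = {
    100: (3.0, 0.5, -0.1),
    1000: (3.6, 0.45, -0.1),
}


def change_point_pvalue(max_t, n_probes):
    """Approximate significance of a maximal-t value by table lookup."""
    # Use the tabulated entry whose probe count is closest to the data.
    nearest = min(GEV_TABLE, key=lambda n: abs(n - n_probes))
    mu, sigma, xi = GEV_TABLE[nearest]
    return gev_sf(max_t, mu, sigma, xi)


p_value = change_point_pvalue(5.0, 120)
```

Because the parameters are read from a precomputed table, evaluating significance costs a fixed number of arithmetic operations per change-point instead of a fresh round of permutations.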
10.  Preferential Localization of Human Origins of DNA Replication at the 5′-Ends of Expressed Genes and at Evolutionarily Conserved DNA Sequences 
PLoS ONE  2011;6(5):e17308.
Background
Replication of mammalian genomes requires the activation of thousands of origins which are both spatially and temporally regulated by as yet unknown mechanisms. At the most fundamental level, our knowledge about the distribution pattern of origins in each of the chromosomes, among different cell types, and whether the physiological state of the cells alters this distribution is at present very limited.
Methodology/Principal Findings
We have used standard λ-exonuclease resistant nascent DNA preparations in the size range of 0.7–1.5 kb obtained from the breast cancer cell line MCF–7 hybridized to a custom tiling array containing 50–60 nt probes evenly distributed among genic and non-genic regions covering about 1% of the human genome. A similar DNA preparation was used for high-throughput DNA sequencing. Array experiments were also performed with DNA obtained from BT-474 and H520 cell lines. By determining the sites showing nascent DNA enrichment, we have localized several thousand origins of DNA replication. Our major findings are: (a) both array and DNA sequencing assay methods produced essentially the same origin distribution profile; (b) origin distribution is largely conserved (>70%) in all cell lines tested; (c) origins are enriched at the 5′ends of expressed genes and at evolutionarily conserved intergenic sequences; and (d) ChIP on chip experiments in MCF-7 showed an enrichment of H3K4Me3 and RNA Polymerase II chromatin binding sites at origins of DNA replication.
Conclusions/Significance
Our results suggest that the program for origin activation is largely conserved among different cell types. Also, our work supports recent studies connecting transcription initiation with replication, and in addition suggests that evolutionarily conserved intergenic sequences have the potential to participate in origin selection. Overall, our observations suggest that replication origin selection is a stochastic process significantly dependent upon local accessibility to replication factors.
doi:10.1371/journal.pone.0017308
PMCID: PMC3094316  PMID: 21602917
11.  14-3-3 σ Expression Effects G2/M Response to Oxygen and Correlates with Ovarian Cancer Metastasis 
PLoS ONE  2011;6(1):e15864.
Background
In vitro cell culture experiments with primary cells have reported that cell proliferation is retarded in the presence of ambient compared to physiological O2 levels. Cancer is primarily a disease of aberrant cell proliferation; therefore, studying cancer cells grown under ambient O2 may be undesirable. To understand better the impact of O2 on the propagation of cancer cells in vitro, we compared the growth potential of a panel of ovarian cancer cell lines under ambient (21%) or physiological (3%) O2.
Principal Findings
Our observations demonstrate that similar to primary cells, many cancer cells maintain an inherent sensitivity to O2, but some display insensitivity to changes in O2 concentration. Further analysis revealed an association between defective G2/M cell cycle transition regulation and O2 insensitivity resultant from overexpression of 14-3-3 σ. Targeting 14-3-3 σ overexpression with RNAi restored O2 sensitivity in these cell lines. Additionally, we found that metastatic ovarian tumors frequently overexpress 14-3-3 σ, which in conjunction with phosphorylated RB, results in poor prognosis.
Conclusions
Cancer cells show differential proliferative sensitivity to changes in O2 concentration. Although a direct link between O2 insensitivity and metastasis was not determined, this investigation showed that an O2-insensitive phenotype in cancer cells correlates with metastatic tumor progression.
doi:10.1371/journal.pone.0015864
PMCID: PMC3018427  PMID: 21249227
12.  Robust inference of the context specific structure and temporal dynamics of gene regulatory network 
BMC Genomics  2010;11(Suppl 3):S11.
Background
The response of cells to changing endogenous or exogenous conditions is governed by intricate molecular interactions, or regulatory networks. To lead to appropriate responses, a regulatory network should be 1) context-specific, i.e., its constituents and topology depend on the phenotypical and experimental context including tissue types and cell conditions, such as damage, stress, macroenvironments of the cell, etc. and 2) time varying, i.e., network elements and their regulatory roles change actively over time to control the endogenous cell states, e.g., different stages in a cell cycle.
Results
A novel network model PathRNet and a reconstruction approach PATTERN are proposed for reconstructing the context specific time varying regulatory networks by integrating microarray gene expression profiles and existing knowledge of pathways and transcription factors. The nodes of the PathRNet are Transcription Factors (TFs) and pathways, and edges represent the regulation between pathways and TFs. The reconstructed PathRNet for Kaposi's sarcoma-associated herpesvirus infection of human endothelial cells reveals the complicated dynamics of the underlying regulatory mechanisms that govern this intricate process. All the related materials including source code are available at http://compgenomics.utsa.edu/tvnet.html.
Conclusions
The proposed PathRNet provides a system level landscape of the dynamics of gene regulatory circuitry. The inference approach PATTERN enables robust reconstruction of the temporal dynamics of pathway-centric regulatory networks. The proposed approach for the first time provides a dynamic perspective of pathway, TF regulations, and their interaction related to specific endogenous and exogenous conditions.
doi:10.1186/1471-2164-11-S3-S11
PMCID: PMC2999341  PMID: 21143778
13.  A Bayesian approach for identifying miRNA targets by combining sequence prediction and gene expression profiling 
BMC Genomics  2010;11(Suppl 3):S12.
Background
MicroRNAs (miRNAs) are single-stranded non-coding RNAs shown to play important regulatory roles in a wide range of biological processes and diseases. The functions and regulatory mechanisms of most miRNAs are still poorly understood, in part because of the difficulty in identifying the miRNA regulatory targets. To this end, computational methods have evolved as important tools for genome-wide target screening. Although considerable work in the past few years has produced many target prediction algorithms, most of them are solely based on sequence, and the accuracy is still poor. In contrast, gene expression profiling from miRNA transfection experiments can provide additional information about miRNA targets. However, most existing research assumes down-regulated mRNAs as targets. Given the fact that the primary function of miRNA is protein inhibition, this assumption is neither sufficient nor necessary.
Results
A novel Bayesian approach is proposed in this paper that integrates sequence-level prediction with expression profiling of miRNA transfection. This approach does not restrict the target to be down-expressed and thus improves the performance of existing target prediction algorithms. The proposed algorithm was tested on simulated data, proteomics data, and IP pull-down data and shown to achieve better performance than existing approaches for target prediction. All the related materials including source code are available at http://compgenomics.utsa.edu/expmicro.html.
Conclusions
The proposed Bayesian algorithm properly integrates the sequence pairing data and mRNA expression profiles for miRNA target prediction. This algorithm is shown to have better prediction performance than existing algorithms.
doi:10.1186/1471-2164-11-S3-S12
PMCID: PMC2999342  PMID: 21143779
14.  Improving performance of mammalian microRNA target prediction 
BMC Bioinformatics  2010;11:476.
Background
MicroRNAs (miRNAs) are single-stranded non-coding RNAs known to regulate a wide range of cellular processes by silencing gene expression at the protein and/or mRNA levels. Computational prediction of miRNA targets is essential for elucidating the detailed functions of miRNA. However, the prediction specificity and sensitivity of the existing algorithms are still too poor to generate meaningful, workable hypotheses for subsequent experimental testing. Constructing a richer and more reliable training data set and developing an algorithm that properly exploits this data set would be the key to improving the performance of current prediction algorithms.
Results
A comprehensive training data set is constructed for mammalian miRNAs with its positive targets obtained from the most up-to-date miRNA target depository called miRecords and its negative targets derived from 20 microarray data sets. A new algorithm SVMicrO is developed, which assumes a 2-stage structure including a site support vector machine (SVM) followed by a UTR-SVM. SVMicrO makes predictions based on 21 optimal site features and 18 optimal UTR features, selected by training from a comprehensive collection of 113 site and 30 UTR features. Comprehensive evaluation of SVMicrO performance has been carried out on the training data, proteomics data, and immunoprecipitation (IP) pull-down data. Comparisons with some popular algorithms demonstrate consistent improvements in prediction specificity, sensitivity and precision in all tested cases. All the related materials including source code and genome-wide prediction of human targets are available at http://compgenomics.utsa.edu/svmicro.html.
Conclusions
A 2-stage SVM based new miRNA target prediction algorithm called SVMicrO is developed. SVMicrO is shown to be able to achieve robust performance. It holds the promise to achieve continuing improvement whenever better training data that contain additional verified or high confidence positive targets and properly selected negative targets are available.
doi:10.1186/1471-2105-11-476
PMCID: PMC2955701  PMID: 20860840
15.  MicroRNA Expression, Survival, and Response to Interferon in Liver Cancer 
The New England journal of medicine  2009;361(15):1437-1447.
Background
Hepatocellular carcinoma is a common and aggressive cancer that occurs mainly in men. We examined microRNA expression patterns, survival, and response to interferon alfa in both men and women with the disease.
Methods
We analyzed three independent cohorts that included a total of 455 patients with hepatocellular carcinoma who had undergone radical tumor resection between 1999 and 2003. MicroRNA-expression profiling was performed in a cohort of 241 patients with hepatocellular carcinoma to identify tumor-related microRNAs and determine their association with survival in men and women. In addition, to validate our findings, we used quantitative reverse-transcriptase–polymerase-chain-reaction assays to measure microRNAs and assess their association with survival and response to therapy with interferon alfa in 214 patients from two independent, prospective, randomized, controlled trials of adjuvant interferon therapy.
Results
In patients with hepatocellular carcinoma, the expression of miR-26a and miR-26b in nontumor liver tissue was higher in women than in men. Tumors had reduced levels of miR-26 expression, as compared with paired noncancerous tissues, which indicated that the level of miR-26 expression was also associated with hepatocellular carcinoma. Moreover, tumors with reduced miR-26 expression had a distinct transcriptomic pattern, and analyses of gene networks revealed that activation of signaling pathways between nuclear factor κB and interleukin-6 might play a role in tumor development. Patients whose tumors had low miR-26 expression had shorter overall survival but a better response to interferon therapy than did patients whose tumors had high expression of the microRNA.
Conclusions
The expression patterns of microRNAs in liver tissue differ between men and women with hepatocellular carcinoma. The miR-26 expression status of such patients is associated with survival and response to adjuvant therapy with interferon alfa.
doi:10.1056/NEJMoa0901282
PMCID: PMC2786938  PMID: 19812400
16.  Gpnmb is a Melanoblast-Expressed, MITF-Dependent Gene 
Pigment cell & melanoma research  2008;22(1):99-110.
SUMMARY
Expression profile analysis clusters Gpnmb with known pigment genes, Tyrp1, Dct, and Si. During development, Gpnmb is expressed in a pattern similar to Mitf, Dct and Si with expression vastly reduced in Mitf mutant animals. Unlike Dct and Si, Gpnmb remains expressed in a discrete population of caudal melanoblasts in Sox10-deficient embryos. To understand the transcriptional regulation of Gpnmb we performed a whole genome annotation of 2,460,048 consensus MITF binding sites, and cross-referenced this with evolutionarily conserved genomic sequences at the GPNMB locus. One conserved element, GPNMB-MCS3, contained two MITF consensus sites, significantly increased luciferase activity in melanocytes and was sufficient to drive expression in melanoblasts in vivo. Deletion of the 5’-most MITF consensus site dramatically reduced enhancer activity indicating a significant role for this site in Gpnmb transcriptional regulation. Future analysis of the Gpnmb locus will provide insight into the transcriptional regulation of melanocytes and Gpnmb expression can be used as a marker for analyzing melanocyte development and disease progression.
SIGNIFICANCE
Comparative analysis of gene expression profiles using melanocyte lines derived from mice provides a powerful resource to explore genetic components of melanocyte development and pigment cell function. Using expression data, we identified Gpnmb as a new marker for early melanoblast development. We show that Gpnmb is dependent on Mitf for in vivo expression and marks a unique set of Sox10-independent melanoblasts. We identified an 89 basepair evolutionarily conserved genomic sequence at the Gpnmb locus that can enhance expression in melanocytes and tested MITF E-box consensus sequences for their involvement in melanocyte-restricted expression. Gpnmb and the panel of genes identified in this study will be valuable resources for understanding the genetic components involved in melanocyte development and diseases.
doi:10.1111/j.1755-148X.2008.00518.x
PMCID: PMC2714741  PMID: 18983539
Gpnmb; Mitf; Sox10; melanoblast; melanocyte; melanoma
17.  A probe-density-based analysis method for array CGH data: simulation, normalization and centralization 
Bioinformatics  2008;24(16):1749-1756.
Motivation: Genomic instability is one of the fundamental factors in tumorigenesis and tumor progression. Many studies have shown that copy-number abnormalities at the DNA level are important in the pathogenesis of cancer. Array comparative genomic hybridization (aCGH), developed based on expression microarray technology, can reveal the chromosomal aberrations in segmental copies at a high resolution. However, due to the nature of aCGH, many standard expression data processing tools, such as data normalization, often fail to yield satisfactory results.
Results: We demonstrated a novel aCGH normalization algorithm, which provides an accurate aCGH data normalization by utilizing the dependency of neighboring probe measurements in aCGH experiments. To facilitate the study, we have developed a hidden Markov model (HMM) to simulate a series of aCGH experiments with random DNA copy number alterations that are used to validate the performance of our normalization. In addition, we applied the proposed normalization algorithm to an aCGH study of lung cancer cell lines. By using the proposed algorithm, data quality and the reliability of experimental results are significantly improved, and the distinct patterns of DNA copy number alterations are observed among those lung cancer cell lines.
Contact: chuangey@ntu.edu.tw
Supplementary information: Source code and figures may be found at http://ntumaps.cgm.ntu.edu.tw/aCGH_supplementary
doi:10.1093/bioinformatics/btn321
PMCID: PMC2732214  PMID: 18603568
18.  GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus 
Bioinformatics  2008;24(23):2798-2800.
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R.
Availability: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website.
Contact: yidong@mail.nih.gov
doi:10.1093/bioinformatics/btn520
PMCID: PMC2639278  PMID: 18842599
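The kind of SQL-based metadata query that a SQLite export makes possible can be sketched with Python's standard sqlite3 module. The example builds a toy in-memory table rather than opening the real GEOmetadb file; the table name `gse`, its two columns, and the accession values are assumptions made for illustration and are not taken from the abstract.

```python
import sqlite3


def find_series(conn, keyword):
    """Return (accession, title) rows whose title contains the keyword."""
    # A parameterized query keeps the keyword out of the SQL text itself.
    return conn.execute(
        "SELECT gse, title FROM gse WHERE title LIKE ? ORDER BY gse",
        (f"%{keyword}%",),
    ).fetchall()


# Toy stand-in for a metadata export (hypothetical schema and records).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gse (gse TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO gse VALUES (?, ?)",
    [
        ("TOY1", "Breast tumor expression profiles"),
        ("TOY2", "Sertoli cell microRNA time course"),
        ("TOY3", "Lung cancer cell line aCGH"),
    ],
)
conn.commit()

hits = find_series(conn, "breast")
```

With a real export, only the `sqlite3.connect` argument would change to the path of the downloaded database file, and the query would need to match that file's actual schema.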
19.  Comparison of Expression Profiles of Metastatic versus Primary Mammary Tumors in MMTV-Wnt-1 and MMTV-Neu Transgenic Mice 
Neoplasia (New York, N.Y.)  2008;10(2):118-124.
Distant metastases of human breast cancers have been suggested to be more different from each other than from their respective primary tumors, based on expression profiling. The mechanism behind this lack of similarity between individual metastases is not known. We used cDNA microarrays to determine the expression profiles of pulmonary metastases and primary mammary tumors in two distinct transgenic models expressing either the Neu or the Wnt-1 oncogene from the mouse mammary tumor virus long terminal repeat (MMTV LTR). We found that pulmonary metastases are similar to each other and to their primary tumors within the same line. However, metastases arising in one transgenic mouse line are very different from either metastases or primary tumors arising in the other line. In addition, we found that, like their primary tumors, lung metastases in Wnt-1 transgenic mice harbor both epithelial and myoepithelial tumor cells and cells that express the putative progenitor cell marker keratin 6. Our data suggest that both gene expression profiles and cellular heterogeneity are preserved after breast cancer has spread to distant sites, and that metastases are similar to each other when their primary tumors were induced by the same oncogene and from the same subset of mammary cells.
PMCID: PMC2244686  PMID: 18283333
20.  Normalization Benefits Microarray-Based Classification 
When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
doi:10.1155/BSB/2006/43056
PMCID: PMC3171318  PMID: 18427588
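The first two normalization methods named in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's procedure: it assumes each spot is summarized by a log ratio and a log intensity, that the offset method subtracts the median log ratio (the abstract does not say whether a mean or median is used), and that linear regression normalization subtracts a least-squares fit of log ratio on log intensity. The function names and synthetic values are hypothetical, and Lowess regression is omitted.

```python
from statistics import mean, median


def offset_normalize(log_ratios):
    """Subtract the median log ratio (assumed choice of center)."""
    center = median(log_ratios)
    return [m - center for m in log_ratios]


def linear_normalize(log_intensities, log_ratios):
    """Fit log_ratio = a + b * log_intensity by least squares; return residuals."""
    x_bar = mean(log_intensities)
    y_bar = mean(log_ratios)
    sxx = sum((x - x_bar) ** 2 for x in log_intensities)
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(log_intensities, log_ratios)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [
        y - (intercept + slope * x)
        for x, y in zip(log_intensities, log_ratios)
    ]


# Synthetic array with a purely intensity-dependent bias and no true signal:
# log_ratio = 0.5 + 0.1 * log_intensity (values invented for illustration).
log_intensity = [8.0, 9.0, 10.0, 11.0, 12.0]
log_ratio = [0.5 + 0.1 * x for x in log_intensity]

offset_result = offset_normalize(log_ratio)
linear_result = linear_normalize(log_intensity, log_ratio)
```

On this synthetic input the offset method only recenters the values and leaves the intensity-dependent trend in place, whereas the linear fit removes it. That is consistent with, though far simpler than, the abstract's finding that regression-based methods slightly outperform the offset method.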
21.  Changes in gene expression during the development of mammary tumors in MMTV-Wnt-1 transgenic mice 
Genome Biology  2005;6(10):R84.
cDNA microarray-derived expression profiles of MMTV-Wnt-1 and MMTV-Neu transgenic mice reveal several hundred genes to be differentially expressed at each stage of breast tumor development.
Background
In human breast cancer, normal mammary cells typically develop into hyperplasia, ductal carcinoma in situ, invasive cancer, and metastasis. The changes in gene expression associated with this stepwise progression are unclear. Mice transgenic for mouse mammary tumor virus (MMTV)-Wnt-1 exhibit discrete steps of mammary tumorigenesis, including hyperplasia, invasive ductal carcinoma, and distant metastasis. These mice might therefore be useful models for discovering changes in gene expression during cancer development.
Results
We used cDNA microarrays to determine the expression profiles of five normal mammary glands, seven hyperplastic mammary glands and 23 mammary tumors from MMTV-Wnt-1 transgenic mice, and 12 mammary tumors from MMTV-Neu transgenic mice. Adipose tissues were used to control for fat cells in the vicinity of the mammary glands. In these analyses, we found that the progression of normal virgin mammary glands to hyperplastic tissues and to mammary tumors is accompanied by differences in the expression of several hundred genes at each step. Some of these differences appear to be unique to the effects of Wnt signaling; others seem to be common to tumors induced by both Neu and Wnt-1 oncogenes.
Conclusion
We described gene-expression patterns associated with breast-cancer development in mice, and identified genes that may be significant targets for oncogenic events. The expression data developed provide a resource for illuminating the molecular mechanisms involved in breast cancer development, especially through the identification of genes that are critical in cancer initiation and progression.
doi:10.1186/gb-2005-6-10-r84
PMCID: PMC1257467  PMID: 16207355
22.  High-Resolution Analysis of Gene Copy Number Alterations in Human Prostate Cancer Using CGH on cDNA Microarrays: Impact of Copy Number on Gene Expression 
Neoplasia (New York, N.Y.)  2004;6(3):240-247.
Abstract
Identification of target genes for genetic rearrangements in prostate cancer and the impact of copy number changes on gene expression are currently not well understood. Here, we applied high-resolution comparative genomic hybridization (CGH) on cDNA microarrays for analysis of prostate cancer cell lines. CGH microarrays identified most of the alterations detected by classical chromosomal CGH, as well as a number of previously unreported alterations. Specific recurrent regions of gain (28) and loss (18) were found, and their boundaries defined with sub-megabasepair accuracy. The most common changes included copy number decreases at 13q, and gains at 1q and 5p. Refined mapping identified several sites, such as at 13q (33–44, 49–51, and 74–76 Mbp from the p-telomere), which matched with minimal regions of loss seen in extensive loss of heterozygosity mapping studies of large numbers of tumors. Previously unreported recurrent changes were found at 2p, 2q, 3p, and 17q (losses), and at 3q, 5p, and 6p (gains). Integration of genomic and transcriptomic data revealed the role of individual candidate target genes for genomic alterations as well as a highly significant (P < .0001) overall association between copy number levels and the percentage of differentially expressed genes. Across the genome, the overall impact of copy number on gene expression levels was, to a large extent, attributable to low-level gains and losses of copy number, corresponding to common deletions and gains of often large chromosomal regions.
PMCID: PMC1502104  PMID: 15153336
Keywords: Copy number alteration; prostate cancer; gene expression; cDNA microarray; CGH microarray
Abbreviations: CGH, comparative genomic hybridization; FISH, fluorescence in situ hybridization; BAC, bacterial artificial chromosome
23.  Transcription Program of Human Herpesvirus 8 (Kaposi's Sarcoma-Associated Herpesvirus) 
Journal of Virology  2001;75(10):4843-4853.
Human herpesvirus 8 (HHV-8), a gammaherpesvirus implicated in Kaposi's sarcoma, primary effusion lymphoma, and Castleman's disease, encodes several pathogenically important cellular homologs. To define the HHV-8 transcription program, RNA obtained from latently infected body cavity-based lymphoma 1 cells induced to undergo lytic replication was used to query a custom HHV-8 DNA microarray containing nearly every known viral open reading frame. The patterns of viral gene expression offer insights into the replication and pathogenic strategies of HHV-8.
doi:10.1128/JVI.75.10.4843-4853.2001
PMCID: PMC114239  PMID: 11312356