Results 1-18 (18)
 

1.  DDN: a caBIG® analytical tool for differential network analysis 
Bioinformatics  2011;27(7):1036-1038.
Summary: Differential dependency network (DDN) is a caBIG® (cancer Biomedical Informatics Grid) analytical tool for detecting and visualizing statistically significant topological changes in transcriptional networks representing two biological conditions. Developed under caBIG®'s In Silico Research Centers of Excellence (ISRCE) Program, DDN enables differential network analysis and provides an alternative way for defining network biomarkers predictive of phenotypes. DDN also serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic or environmental variables may affect biological networks and clinical phenotypes. Besides the standalone Java application, we have also developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly.
Availability: The Java and MATLAB source code can be downloaded at the authors' web site http://www.cbil.ece.vt.edu/software.htm
Contact: yuewang@vt.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr052
PMCID: PMC3065688  PMID: 21296752
2.  PUGSVM: a caBIG™ analytical tool for multiclass gene selection and predictive classification 
Bioinformatics  2010;27(5):736-738.
Summary: Phenotypic Up-regulated Gene Support Vector Machine (PUGSVM) is a cancer Biomedical Informatics Grid (caBIG™) analytical tool for multiclass gene selection and classification. PUGSVM addresses the problem of imbalanced class separability, small sample size and high gene space dimensionality, where multiclass gene markers are defined by the union of one-versus-everyone phenotypic upregulated genes, and used by a well-matched one-versus-rest support vector machine. PUGSVM provides a simple yet more accurate strategy to identify statistically reproducible mechanistic marker genes for characterization of heterogeneous diseases.
Availability: http://www.cbil.ece.vt.edu/caBIG-PUGSVM.htm.
Contact: yuewang@vt.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btq721
PMCID: PMC3042183  PMID: 21186245
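As a rough illustration of the gene-selection idea in entry 2, the sketch below scores a gene for one class by a one-versus-everyone fold change: the class mean divided by the largest mean among the remaining classes. This is only one plausible reading of "one-versus-everyone phenotypic upregulated genes", not the PUGSVM implementation; the function name `ove_fold_change`, the toy data, and the use of plain class means are all assumptions.

```python
from statistics import mean

def ove_fold_change(expr, labels, target):
    """Score one gene: mean in `target` class over the highest mean of any other class.

    expr   -- expression values for a single gene, one per sample (positive numbers)
    labels -- class label for each sample
    """
    by_class = {}
    for value, label in zip(expr, labels):
        by_class.setdefault(label, []).append(value)
    target_mean = mean(by_class[target])
    # "Everyone": the gene must beat each other class, so compare with the strongest rival.
    rival_mean = max(mean(v) for k, v in by_class.items() if k != target)
    return target_mean / rival_mean

# Hypothetical toy data: three classes, two samples each.
labels = ["A", "A", "B", "B", "C", "C"]
genes = {
    "g1": [8.0, 8.0, 2.0, 2.0, 1.0, 1.0],   # up in A versus both B and C
    "g2": [8.0, 8.0, 8.0, 8.0, 1.0, 1.0],   # up in A versus C only
}
scores = {name: ove_fold_change(values, labels, "A") for name, values in genes.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

A one-versus-rest comparison would pool classes B and C and could still report g2 as upregulated in A (8 against a pooled mean of 4.5), which is the kind of case a one-versus-everyone criterion is meant to exclude.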
3.  Elevated CA125 in primary peritoneal serous psammocarcinoma: a case report and review of the literature 
BMJ Case Reports  2009;2009:bcr10.2008.1063.
Psammocarcinoma is a rare form of serous carcinoma of the ovary or peritoneum, and it is characterised by extensive psammoma body formation and invasion of surrounding structures. This report describes the case of a 42-year-old woman who presented with large ascites and raised CA125 level. Following a full staging laparotomy, she was made stable in stage IIIC. Despite the limited number of cases reported, the clinical prognosis of carcinomas resembling the serous borderline lesions seems much more favourable than the common serous carcinomas. A summary of all the reported cases is provided to highlight the clinical and prognostic features of this scarce tumour.
doi:10.1136/bcr.10.2008.1063
PMCID: PMC3028158  PMID: 21686474
4.  Motif-guided sparse decomposition of gene expression data for regulatory module identification 
BMC Bioinformatics  2011;12:82.
Background
Genes work coordinately as gene modules or gene networks. Various computational approaches have been proposed to find gene modules based on gene expression data; for example, gene clustering is a popular method for grouping genes with similar gene expression patterns. However, traditional gene clustering often yields unsatisfactory results for regulatory module identification because the resulting gene clusters are co-expressed but not necessarily co-regulated.
Results
We propose a novel approach, motif-guided sparse decomposition (mSD), to identify gene regulatory modules by integrating gene expression data and DNA sequence motif information. The mSD approach is implemented as a two-step algorithm comprising estimates of (1) transcription factor activity and (2) the strength of the predicted gene regulation event(s). Specifically, a motif-guided clustering method is first developed to estimate the transcription factor activity of a gene module; sparse component analysis is then applied to estimate the regulation strength, and so predict the target genes of the transcription factors. The mSD approach was first tested for its improved performance in finding regulatory modules using simulated and real yeast data, revealing functionally distinct gene modules enriched with biologically validated transcription factors. We then demonstrated the efficacy of the mSD approach on breast cancer cell line data and uncovered several important gene regulatory modules related to endocrine therapy of breast cancer.
Conclusion
We have developed a new integrated strategy, namely motif-guided sparse decomposition (mSD) of gene expression data, for regulatory module identification. The mSD method features a novel motif-guided clustering method for transcription factor activity estimation by finding a balance between co-regulation and co-expression. The mSD method further utilizes a sparse decomposition method for regulation strength estimation. The experimental results show that such a motif-guided strategy can provide context-specific regulatory modules in both yeast and breast cancer studies.
doi:10.1186/1471-2105-12-82
PMCID: PMC3072956  PMID: 21426557
5.  Knowledge-guided gene ranking by coordinative component analysis 
BMC Bioinformatics  2010;11:162.
Background
In cancer, gene networks and pathways often exhibit dynamic behavior, particularly during the process of carcinogenesis. Thus, it is important to prioritize those genes that are strongly associated with the functionality of a network. Traditional statistical methods are often poorly suited to identifying biologically relevant member genes, motivating researchers to incorporate biological knowledge into gene ranking methods. However, current integration strategies are often heuristic and fail to incorporate fully the true interplay between biological knowledge and gene expression data.
Results
To improve knowledge-guided gene ranking, we propose a novel method called coordinative component analysis (COCA) in this paper. COCA explicitly captures those genes within a specific biological context that are likely to be expressed in a coordinative manner. Formulated as an optimization problem to maximize the coordinative effort, COCA is designed to first extract the coordinative components based on partial guidance from knowledge genes and then rank the genes according to their participation strengths. An embedded bootstrapping procedure is implemented to improve the statistical robustness of the solutions. COCA was initially tested on simulation data and then on published gene expression microarray data to demonstrate its improved performance as compared to traditional statistical methods. Finally, the COCA approach has been applied to stem cell data to identify biologically relevant genes in signaling pathways. As a result, the COCA approach uncovers novel pathway members that may shed light on pathway deregulation in cancers.
Conclusion
We have developed a new integrative strategy to combine biological knowledge and microarray data for gene ranking. The method uses knowledge genes as guidance to first extract coordinative components, and then ranks the genes according to their contribution to a network or pathway. The experimental results show that such a knowledge-guided strategy can provide context-specific gene ranking with improved performance in pathway member identification.
doi:10.1186/1471-2105-11-162
PMCID: PMC2865494  PMID: 20353603
6.  Differential dependency network analysis to identify condition-specific topological changes in biological networks 
Bioinformatics  2008;25(4):526-532.
Motivation: Significant efforts have been made to acquire data under different conditions and to construct static networks that can explain various gene regulation mechanisms. However, gene regulatory networks are dynamic and condition-specific; under different conditions, networks exhibit different regulation patterns accompanied by different transcriptional network topologies. Thus, an investigation on the topological changes in transcriptional networks can facilitate the understanding of cell development or provide novel insights into the pathophysiology of certain diseases, and help identify the key genetic players that could serve as biomarkers or drug targets.
Results: Here, we report a differential dependency network (DDN) analysis to detect statistically significant topological changes in the transcriptional networks between two biological conditions. We propose a local dependency model to represent the local structures of a network by a set of conditional probabilities. We develop an efficient learning algorithm to learn the local dependency model using the Lasso technique. A permutation test is subsequently performed to estimate the statistical significance of each learned local structure. In testing on a simulation dataset, the proposed algorithm accurately detected all the genes with network topological changes. The method was then applied to the estrogen-dependent T-47D estrogen receptor-positive (ER+) breast cancer cell line datasets and human and mouse embryonic stem cell datasets. In both experiments using real microarray datasets, the proposed method produced biologically meaningful results. We expect DDN to emerge as an important bioinformatics tool in transcriptional network analyses. While we focus specifically on transcriptional networks, the DDN method we introduce here is generally applicable to other biological networks with similar characteristics.
Availability: The DDN MATLAB toolbox and experiment data are available at http://www.cbil.ece.vt.edu/software.htm.
Contact: yuewang@vt.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn660
PMCID: PMC2642641  PMID: 19112081
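The permutation-test step described in entry 6 can be sketched as follows. This is a deliberately simplified stand-in: instead of the Lasso-learned local dependency model, it uses the Pearson correlation of a single gene pair as the "local structure", and it permutes sample labels across the two conditions to estimate significance. The function names and the toy data are hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def dependency_change(cond_a, cond_b):
    """Absolute difference in gene-gene correlation between two conditions."""
    ra = pearson([p[0] for p in cond_a], [p[1] for p in cond_a])
    rb = pearson([p[0] for p in cond_b], [p[1] for p in cond_b])
    return abs(ra - rb)

def permutation_p_value(cond_a, cond_b, n_perm=1000, seed=0):
    """Share of label permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = dependency_change(cond_a, cond_b)
    pooled = list(cond_a) + list(cond_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to the two conditions at random
        stat = dependency_change(pooled[:len(cond_a)], pooled[len(cond_a):])
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy samples of (gene X, gene Y): positively coupled in A, negatively coupled in B.
cond_a = [(float(i), float(i)) for i in range(20)]
cond_b = [(float(i), float(-i)) for i in range(20)]
p_value = permutation_p_value(cond_a, cond_b)
print(p_value)
```

Swapping `dependency_change` for a statistic computed from a fitted local model would move this closer to the procedure the abstract outlines; the permutation loop itself would stay the same.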
7.  Identifying Conserved and Divergent Transcriptional Modules by Cross-species Matrix Decomposition on Microarray Data 
Cross-species comparison of gene expression profiles allows deciphering fundamental and species-specific transcriptional programs of cells and offers insight into organization and evolution of the genome and genetic network. Here, we propose an algorithm for comparing microarray data from different species to unravel transcriptional modules that are conserved or divergent through evolution. The proposed algorithm is based on cross-species matrix decomposition that includes a nonlinear independent component analysis followed by a generalized probabilistic sparse matrix factorization on microarray data from different species. The proposed algorithm captures transcriptional modularity that might result from highly nonlinear interactions among genes, and partitions genes into mutually non-exclusive transcriptional modules. The conserved transcriptional modules are identified by the latent variables that are associated with predominant biological prototypes shared across species. We illustrated the application of the proposed algorithm by an analysis of human and mouse embryonic stem cell (ESC) data. The analysis uncovered conserved and divergent transcriptional modules in the ESC transcriptomes, shedding light on the understanding of fundamental and species-specific regulatory mechanisms controlling ESC development.
doi:10.4172/jpb.1000068
PMCID: PMC2817969  PMID: 20148181
comparative transcriptomics; transcriptional modules; generalized probabilistic sparse matrix factorization; embryonic stem cells
8.  Exploring pathways from gene coexpression to network dynamics 
One of the major challenges in post-genomic research is to understand how physiological and pathological phenotypes arise from the networks or connectivity of expressed genes. In addressing this issue, we have developed two computational algorithms, CoExMiner and PathwayPro, to explore static features of gene coexpression and dynamic behaviors of gene networks. CoExMiner is based on B-spline approximation followed by coefficient of determination (CoD) estimation for modeling gene coexpression patterns. The algorithm allows exploration of transcriptional responses that involve coordinated expression of genes encoding proteins that work in concert in the cell. PathwayPro is based on a finite-state Markov chain model for mimicking dynamic behaviors of a transcriptional network. The algorithm allows quantitative assessment of a wide range of network responses, including susceptibility to disease, potential usefulness of a given drug, and consequences of such external stimuli as pharmacological interventions or caloric restriction. We demonstrated the applications of CoExMiner and PathwayPro by examining gene expression profiles of ligands and receptors in cancerous and non-cancerous cells and network dynamics of the leukemia-associated BCR-ABL pathway. The examinations disclosed both linear and nonlinear relationships of ligand-receptor interactions associated with cancer development, identified disease and drug targets of leukemia, and provided new insights into the biology of the diseases. The analysis using these newly developed algorithms shows the great usefulness of computational systems biology approaches for biological and medical research.
doi:10.1007/978-1-59745-243-4_12
PMCID: PMC2786169  PMID: 19381544
systems biology; coexpression; pathway dynamics; network modeling; coefficient of determination (CoD); Markov chain; transcriptional intervention
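To make the finite-state Markov chain idea in entry 8 concrete, here is a minimal generic sketch. The abstract does not describe PathwayPro's state space or transition model, so the two states, the transition probabilities, and the "intervention" below are invented purely for illustration. The long-run behavior of each chain is approximated by stepping the state distribution forward repeatedly.

```python
def stationary_distribution(transition, n_iter=500):
    """Approximate the long-run state distribution of a finite Markov chain.

    transition[i][j] is the probability of moving from state i to state j;
    each row is assumed to sum to 1.
    """
    n = len(transition)
    dist = [1.0 / n] * n  # start from a uniform distribution over states
    for _ in range(n_iter):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist

# Hypothetical two-state pathway: state 0 = "normal", state 1 = "disease-like".
baseline = [[0.9, 0.1],
            [0.5, 0.5]]
# A made-up intervention that makes leaving the disease-like state more likely.
treated = [[0.9, 0.1],
           [0.8, 0.2]]

pi_baseline = stationary_distribution(baseline)
pi_treated = stationary_distribution(treated)
print(pi_baseline, pi_treated)
```

Comparing the long-run probability of the disease-like state before and after changing the transition matrix is one simple way to quantify a network response to an intervention in such a model.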
9.  Cross Species Transcriptional Profiles Establish a Functional Portrait of Embryonic Stem Cells 
Genomics  2006;89(1):22-35.
An understanding of the regulatory mechanisms responsible for pluripotency in embryonic stem cells (ESCs) is critical for realizing their potential in medicine and science. Significant similarities exist among ESCs harvested from different species, yet major differences have also been observed. Here, by cross-species analysis of a large set of functional categories and all transcription factors and growth factors, we revealed conserved and divergent functional landscapes underlining fundamental and species-specific mechanisms that regulate ESC development. Global transcriptional trends derived from all expressed genes, instead of differentially expressed genes alone, were examined, allowing for a higher discriminating power in the functional portrait. We demonstrate that cross-species correlation of transcriptional changes that occur upon ESC differentiation is a powerful predictor of ESC-important biological pathways and functional cores within a pathway. Hundreds of functional modules, as defined by Gene Ontology, were associated with conserved expression patterns but bear no overt relationship to ESC development, suggestive of new mechanisms critical to ESC pluripotency. Yet other functional modules were not conserved; instead, they were significantly up-regulated in ESCs of either species, suggestive of species-specific regulation. The comparison of ESCs across species and between human ESCs and embryonal carcinoma stem cells (ECCs) suggests that while pluripotency as an essential function in multicellular organisms is conserved through evolution, mechanisms primed for differentiation are less conserved and contribute substantially to the differences among stem cells derived from different tissues or species. Our findings establish a basis for defining the “stemness” properties of ESCs from the perspective of functional conservation and variation. 
The data and analyses resulting from this study provide a framework for new hypotheses and research directions, and a public resource for functional genomics of ESCs.
doi:10.1016/j.ygeno.2006.09.010
PMCID: PMC2658876  PMID: 17055697
embryonic stem cell; embryonic bodies; comparative transcriptomics; functional genomics; LIF; FGF; Nodal; Wnt; BMP; TGFβ
10.  Evolutionarily Conserved Transcriptional Co-Expression Guiding Embryonic Stem Cell Differentiation 
PLoS ONE  2008;3(10):e3406.
Background
Understanding the molecular mechanisms controlling pluripotency in embryonic stem cells (ESCs) is of central importance towards realizing their potentials in medicine and science. Cross-species examination of transcriptional co-expression allows elucidation of fundamental and species-specific mechanisms regulating ESC self-renewal or differentiation.
Methodology/Principal Findings
We examined transcriptional co-expression of ESCs from pathways to global networks under the framework of human-mouse comparisons. Using generalized singular value decomposition and comparative partition around medoids algorithms, evolutionarily conserved and divergent transcriptional co-expression regulating pluripotency were identified from ESC-critical pathways including ACTIVIN/NODAL, AKT/PTEN, BMP, CELL CYCLE, JAK/STAT, PI3K, TGFβ and WNT. A set of transcription factors, including FOX, GATA, MYB, NANOG, OCT, PAX, SOX and STAT, and the FGF response element were identified that represent key regulators underlying the transcriptional co-expression. By transcriptional intervention conducted in silico, dynamic behavior of pathways was examined, demonstrating how much and in which specific ways each gene or gene combination affects the behavior transition of a pathway in response to ESC differentiation or pluripotency induction. The global co-expression networks of ESCs were dominated by highly connected hub genes such as IGF2, JARID2, LCK, MYCN, NASP, OCT4, ORC1L, PHC1 and RUVBL1, which are possibly critical in determining the fate of ESCs.
Conclusions/Significance
Through these studies, evolutionary conservation at genomic, transcriptomic, and network levels is shown to be an effective predictor of molecular factors and mechanisms controlling ESC development. Various hypotheses regarding mechanisms controlling ESC development were generated, which could be further validated by in vitro experiments. Our findings shed light on the systems-level understanding of how ESC differentiation or pluripotency arises from the connectivity or networks of genes, and provide a “road-map” for further experimental investigation.
doi:10.1371/journal.pone.0003406
PMCID: PMC2566604  PMID: 18923680
11.  caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data 
BMC Bioinformatics  2008;9:383.
Background
The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables.
Results
In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample clustering, and phenotype clustering (wherein phenotype labels for samples are known), albeit with minor algorithm modifications customized to each of these tasks.
Conclusion
VISDA achieved robust and superior clustering accuracy, compared with several benchmark clustering schemes. The model order selection scheme in VISDA was shown to be effective for high dimensional genomic data clustering. On muscular dystrophy data and muscle regeneration data, VISDA identified biologically relevant co-expressed gene clusters. VISDA also captured the pathological relationships among different phenotypes revealed at the molecular level, through phenotype clustering on muscular dystrophy data and multi-category cancer data.
doi:10.1186/1471-2105-9-383
PMCID: PMC2566986  PMID: 18801195
12.  Unraveling transcriptional regulatory programs by integrative analysis of microarray and transcription factor binding data 
Bioinformatics  2008;24(17):1874-1880.
Motivation: Unraveling the transcriptional regulatory program mediated by transcription factors (TFs) is a fundamental objective of computational biology, yet still remains a challenge.
Method: Here, we present a new methodology that integrates microarray and TF binding data for unraveling transcriptional regulatory networks. The algorithm is based on a two-stage constrained matrix decomposition model. The model takes into account the non-linear structure in gene expression data, particularly in the TF-target gene interactions and the combinatorial nature of gene regulation by TFs. The gene expression profile is modeled as a linear weighted combination of the activity profiles of a set of TFs. The TF activity profiles are deduced from the expression levels of TF target genes, rather than directly from the TFs themselves. The TF-target gene relationships are derived from ChIP-chip and other TF binding data. The proposed algorithm can not only identify transcriptional modules, but also reveal regulatory programs of which TFs control which target genes in which specific ways (either activating or inhibiting).
Results: In comparison with other methods, our algorithm identifies biologically more meaningful transcriptional modules relating to specific TFs. We applied the new algorithm on yeast cell cycle and stress response data. While known transcriptional regulations were confirmed, novel TF-gene interactions were predicted and provide new insights into the regulatory mechanisms of the cell.
Contact: zhanmi@mail.nih.gov
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn332
PMCID: PMC2519161  PMID: 18586698
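The modeling statement in entry 12, that a gene's expression profile is a linear weighted combination of TF activity profiles, can be illustrated with a small forward model. This sketch shows only how expression would be reconstructed from given weights and activities; the two-stage constrained fitting procedure is not shown. The weight and activity values are invented, and treating a zero weight as "no binding evidence" is an assumption about how the binding data might constrain the model.

```python
def reconstruct_expression(weights, activities):
    """expression[g][t] = sum over TFs j of weights[g][j] * activities[j][t]."""
    n_times = len(activities[0])
    return [
        [sum(w * tf[t] for w, tf in zip(gene_weights, activities)) for t in range(n_times)]
        for gene_weights in weights
    ]

# Hypothetical activity profiles of two TFs over three time points.
activities = [[1.0, 2.0, 3.0],    # TF1
              [3.0, 2.0, 1.0]]    # TF2
# Hypothetical regulation weights; the sign gives the direction of regulation.
weights = [[2.0, 0.0],     # gene A: activated by TF1
           [0.0, -1.0],    # gene B: inhibited by TF2
           [1.0, 1.0]]     # gene C: combinatorial control by both TFs

expression = reconstruct_expression(weights, activities)
print(expression)
```

In this toy case gene C stays flat even though both of its regulators change, which is why TF activity is hard to read off from any single expression profile.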
13.  Elucidation of a C-Rich Signature Motif in Target mRNAs of RNA-Binding Protein TIAR 
Molecular and Cellular Biology  2007;27(19):6806-6817.
The RNA-binding protein TIAR (related to TIA-1 [T-cell-restricted intracellular antigen 1]) was shown to associate with subsets of mRNAs bearing U-rich sequences in their 3′ untranslated regions. TIAR can function as a translational repressor, particularly in response to cytotoxic agents. Using unstressed colon cancer cells, collections of mRNAs associated with TIAR were isolated by immunoprecipitation (IP) of (TIAR-RNA) ribonucleoprotein (RNP) complexes, identified by microarray analysis, and used to elucidate a common signature motif present among TIAR target transcripts. The predicted TIAR motif was an unexpectedly cytosine-rich, 28- to 32-nucleotide-long element forming a stem and a loop of variable size with an additional side loop. The ability of TIAR to bind an RNA oligonucleotide with a representative C-rich TIAR motif sequence was verified in vitro using surface plasmon resonance. By this analysis, TIAR containing two or three RNA recognition domains (TIAR12 and TIAR123) showed low but significant binding to the C-rich sequence. In vivo, insertion of the C-rich motif into a heterologous reporter strongly suppressed its translation in cultured cells. Using this signature motif, an additional ∼2,209 UniGene targets were identified (2.0% of the total UniGene database). A subset of specific mRNAs were validated by RNP IP analysis. Interestingly, in response to treatment with short-wavelength UV light (UVC), a stress agent causing DNA damage, each of these target mRNAs bearing C-rich motifs dissociated from TIAR. In turn, expression of the encoded proteins was elevated in a TIAR-dependent manner. In sum, we report the identification of a C-rich signature motif present in TIAR target mRNAs whose association with TIAR decreases following exposure to a stress-causing agent.
doi:10.1128/MCB.01036-07
PMCID: PMC2099219  PMID: 17682065
14.  Gene Module Identification from Microarray Data Using Nonnegative Independent Component Analysis 
Genes mostly interact with each other to form transcriptional modules for performing single or multiple functions. It is important to unravel such transcriptional modules and to determine how disturbances in them may lead to disease. Here, we propose a non-negative independent component analysis (nICA) approach for transcriptional module discovery. The nICA method utilizes the non-negativity constraint to enforce the independence of biological processes within the participating genes. As such, nICA decomposes the observed gene expression into positive independent components, which better fit the reality of the corresponding putative biological processes. In conjunction with nICA modeling, the visual statistical data analyzer (VISDA) is applied to group genes into modules in latent variable space. We demonstrate the usefulness of the approach through the identification of composite modules from yeast data and the discovery of pathway modules in muscle regeneration.
PMCID: PMC2759148  PMID: 19936101
transcriptional module; gene module identification; non-negative independent component analysis (nICA); visual statistical data analyzer (VISDA); muscle regeneration; microarray data analysis
15.  Gene expression atlas of the mouse central nervous system: impact and interactions of age, energy intake and gender 
Genome Biology  2007;8(11):R234.
The transcriptional profiles of five regions of the central nervous system (CNS) of mice varying in age, gender and dietary intake were measured by microarray. The resulting data provide insights into the mechanisms of age-, diet- and gender-related CNS plasticity and vulnerability in mammals.
Background
The structural and functional complexity of the mammalian central nervous system (CNS) is organized and modified by complicated molecular signaling processes that are poorly understood.
Results
We measured transcripts of 16,896 genes in 5 CNS regions from cohorts of young, middle-aged and old male and female mice that had been maintained on either a control diet or a low energy diet known to retard aging. Each CNS region (cerebral cortex, hippocampus, striatum, cerebellum and spinal cord) possessed its own unique transcriptome fingerprint that was independent of age, gender and energy intake. Less than 10% of genes were significantly affected by age, diet or gender, with most of these changes occurring between middle and old age. The transcriptome of the spinal cord was the most responsive to age, diet and gender, while the striatal transcriptome was the least responsive. Gender and energy restriction had particularly robust influences on the hippocampal transcriptome of middle-aged mice. Prominent functional groups of age- and energy-sensitive genes were those encoding proteins involved in DNA damage responses (Werner and telomere-associated proteins), mitochondrial and proteasome functions, cell fate determination (Wnt and Notch signaling) and synaptic vesicle trafficking.
Conclusion
Mouse CNS transcriptomes responded to age, energy intake and gender in a regionally distinctive manner. The systematic transcriptome dataset also provides a window into mechanisms of age-, diet- and sex-related CNS plasticity and vulnerability.
doi:10.1186/gb-2007-8-11-r234
PMCID: PMC2258177  PMID: 17988385
16.  Analysis of Gene Coexpression by B-Spline Based CoD Estimation 
The gene coexpression study has emerged as a novel holistic approach for microarray data analysis. Different indices have been used in exploring coexpression relationships, but each is associated with certain pitfalls. Pearson's correlation coefficient, for example, cannot uncover nonlinear patterns or the directionality of coexpression. Mutual information can detect nonlinearity but fails to show directionality. The coefficient of determination (CoD) is unique in exploring different patterns of gene coexpression, but it has so far only been applied to discrete data, and converting continuous microarray data to a discrete format could lead to information loss. Here, we propose an effective algorithm, CoexPro, for gene coexpression analysis. The new algorithm is based on B-spline approximation of coexpression between a pair of genes, followed by CoD estimation. The algorithm was justified by simulation studies and by functional semantic similarity analysis. The proposed algorithm is capable of uncovering both linear and a specific class of nonlinear relationships from continuous microarray data. It can also provide suggestions for possible directionality of coexpression to the researchers. The new algorithm presents a novel model for gene coexpression and will be a valuable tool for a variety of gene expression and network studies. The application of the algorithm was demonstrated by an analysis of ligand-receptor coexpression in cancerous and noncancerous cells. The software implementing the algorithm is available upon request to the authors.
doi:10.1155/2007/49478
PMCID: PMC3171342  PMID: 17846662
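The point in entry 16 that CoD can pick up both nonlinearity and directionality can be illustrated with a small sketch. It takes CoD to mean the relative reduction in squared prediction error, (e0 - e) / e0, where e0 is the error of predicting the target gene by its mean and e is the error when a predictor gene is used. As a loudly labeled simplification, the predictor here is piecewise constant (bin means) rather than the B-spline approximation the abstract describes; the function name `cod`, the bin count, and the data are hypothetical.

```python
def cod(predictor, target, n_bins=8):
    """Coefficient of determination of `target` given `predictor`: (e0 - e) / e0.

    e0 is the squared error of predicting `target` by its overall mean;
    e is the squared error of predicting it by the mean of its predictor bin.
    Assumes neither variable is constant.
    """
    lo, hi = min(predictor), max(predictor)
    width = (hi - lo) / n_bins
    bins = {}
    for p, t in zip(predictor, target):
        idx = min(int((p - lo) / width), n_bins - 1)  # the top edge falls in the last bin
        bins.setdefault(idx, []).append(t)
    overall_mean = sum(target) / len(target)
    e0 = sum((t - overall_mean) ** 2 for t in target)
    e = sum((t - sum(ts) / len(ts)) ** 2 for ts in bins.values() for t in ts)
    return (e0 - e) / e0

# Toy nonlinear relationship: gene Y is the square of gene X.
x = [float(i) for i in range(-20, 21)]
y = [v * v for v in x]

cod_x_to_y = cod(x, y)   # X predicts Y well
cod_y_to_x = cod(y, x)   # Y cannot tell +x from -x, so it predicts X poorly
print(cod_x_to_y, cod_y_to_x)
```

For this symmetric data the Pearson correlation between X and Y is zero, while the asymmetry between the two CoD values hints at a possible direction of dependence.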
17.  Genome wide profiling of human embryonic stem cells (hESCs), their derivatives and embryonal carcinoma cells to develop base profiles of U.S. Federal government approved hESC lines 
Background
In order to compare the gene expression profiles of human embryonic stem cell (hESC) lines and their differentiated progeny and to monitor feeder contaminations, we have examined gene expression in seven hESC lines and human fibroblast feeder cells using Illumina® bead arrays that contain probes for 24,131 transcripts.
Results
A total of 48 different samples (including duplicates) grown in multiple laboratories under different conditions were analyzed and pairwise comparisons were performed in all groups. Hierarchical clustering showed that blinded duplicates were correctly identified as the closest related samples. hESC lines clustered together irrespective of the laboratory in which they were maintained. hESCs could be readily distinguished from embryoid bodies (EB) differentiated from them and the karyotypically abnormal hESC line BG01V. The embryonal carcinoma (EC) line NTera2 is a useful model for evaluating characteristics of hESCs. Expression of subsets of individual genes was validated by comparing with published databases, MPSS (Massively Parallel Signature Sequencing) libraries, and parallel analysis by microarray and RT-PCR.
Conclusion
We show that Illumina's bead array platform is a reliable, reproducible and robust method for developing base global profiles of cells and identifying similarities and differences in a large number of samples.
doi:10.1186/1471-213X-6-20
PMCID: PMC1523200  PMID: 16672070
18.  Transcriptome coexpression map of human embryonic stem cells 
BMC Genomics  2006;7:103.
Background
Human embryonic stem (ES) cells hold great promise for medicine and science. The transcriptome of human ES cells has been studied in detail in recent years. However, no systematic analysis has yet addressed whether gene expression in human ES cells may be regulated in chromosomal domains, and no chromosomal domains of coexpression have been identified.
Results
We report the first transcriptome coexpression map of the human ES cell and the earliest stage of ES differentiation, the embryoid body (EB), for the analysis of how transcriptional regulation interacts with genomic structure during ES self-renewal and differentiation. We determined the gene expression profiles from multiple ES and EB samples and identified chromosomal domains showing coexpression of adjacent genes on the genome. The coexpression domains were not random, with significant enrichment in chromosomes 8, 11, 16, 17, 19, and Y in the ES state, and 6, 11, 17, 19 and 20 in the EB state. The domains were significantly associated with Giemsa-negative bands in EB, yet showed little correlation with known cytogenetic structures in ES cells. Different patterns of coexpression were revealed by comparative transcriptome mapping between ES and EB.
Conclusion
The findings and methods reported in this investigation advance our understanding of how genome organization affects gene expression in human ES cells and help to identify new mechanisms and pathways controlling ES self-renewal or differentiation.
doi:10.1186/1471-2164-7-103
PMCID: PMC1523211  PMID: 16670017
