1.  Early peroxisome proliferator-activated receptor gamma regulated genes involved in expansion of pancreatic beta cell mass 
BMC Medical Genomics  2011;4:86.
Background
The progression towards type 2 diabetes depends on the allostatic response of pancreatic beta cells to synthesise and secrete enough insulin to compensate for insulin resistance. The endocrine pancreas is a plastic tissue able to expand or regress in response to the requirements imposed by physiological and pathophysiological states associated with insulin resistance such as pregnancy, obesity or ageing, but the mechanisms mediating beta cell mass expansion in these scenarios are not well defined. We have recently shown that ob/ob mice with genetic ablation of PPARγ2, a mouse model known as the POKO mouse, failed to expand their beta cell mass. This phenotype contrasted with the appropriate expansion of the beta cell mass observed in their obese littermate ob/ob mice. Thus, comparison of islets from these models, particularly at early ages, could provide new insights into early PPARγ dependent transcriptional responses involved in the process of beta cell mass expansion.
Results
Here we have investigated PPARγ dependent transcriptional responses occurring during the early stages of beta cell adaptation to insulin resistance in wild type, ob/ob, PPARγ2 KO and POKO mice. We have identified genes known to regulate both the rate of proliferation and the survival signals of beta cells. Moreover we have also identified new pathways induced in ob/ob islets that remained unchanged in POKO islets, suggesting an important role for PPARγ in maintenance/activation of mechanisms essential for the continued function of the beta cell.
Conclusions
Our data suggest that the expansion of beta cell mass observed in ob/ob islets is associated with the activation of an immune response that fails to occur in POKO islets. We have also identified other PPARγ dependent differentially regulated pathways including cholesterol biosynthesis, apoptosis through TGF-β signaling and decreased oxidative phosphorylation.
doi:10.1186/1755-8794-4-86
PMCID: PMC3315430  PMID: 22208362
2.  Allele-specific disparity in breast cancer 
BMC Medical Genomics  2011;4:85.
Background
In a cancer cell the number of copies of a locus may vary due to amplification and deletion and these variations are denoted as copy number alterations (CNAs). We focus on the disparity of CNAs in tumour samples, which were compared to those in blood in order to identify the directional loss of heterozygosity.
Methods
We propose a numerical algorithm and apply it to data from the Illumina 109K-SNP array on 112 samples from breast cancer patients. B-allele frequency (BAF) and log R ratio (LRR) of Illumina were used to estimate Euclidean distances. For each locus, we compared genotypes in blood and tumour for the subset of samples that were heterozygous in blood. We identified loci showing preferential disparity from heterozygous toward either A-allele or B-allele homozygosity (allelic disparity). The chi-squared and Cochran-Armitage trend tests were used to examine whether there is an association between high levels of disparity in single nucleotide polymorphisms (SNPs) and molecular, clinical and tumour-related parameters. To identify pathways and network functions over-represented within the resulting gene sets, we used Ingenuity Pathway Analysis (IPA).
Results
To identify loci with a high level of disparity, we selected SNPs 1) with a substantial degree of disparity and 2) with substantial frequency (at least 50% of the samples heterozygous for the respective locus). We report the overall difference in disparity in high-grade tumours compared to low-grade tumours (p-value < 0.001) and significant associations between disparity in multiple single loci and clinical parameters. The most significantly associated network functions within the genes represented in the loci of disparity were identified, including lipid metabolism, small-molecule biochemistry, and nervous system development and function. No evidence for over-representation of directional disparity in a list of stem cell genes was obtained; however, genes appeared to be more often altered by deletion than by amplification.
Conclusions
Our data suggest that directional loss and amplification exist in breast cancer. These are highly associated with grade, which may indicate that they are enforced with increasing number of cell divisions. Whether there is selective pressure for some loci to be preferentially amplified or deleted remains to be confirmed.
doi:10.1186/1755-8794-4-85
PMCID: PMC3337547  PMID: 22188678
3.  Batch effect correction for genome-wide methylation data with Illumina Infinium platform 
BMC Medical Genomics  2011;4:84.
Background
Genome-wide methylation profiling has led to more comprehensive insights into gene regulation mechanisms and potential therapeutic targets. Illumina Human Methylation BeadChip is one of the most commonly used genome-wide methylation platforms. Similar to other microarray experiments, methylation data is susceptible to various technical artifacts, particularly batch effects. To date, little attention has been given to issues related to normalization and batch effect correction for this kind of data.
Methods
We evaluated three common normalization approaches and investigated their performance in batch effect removal using three datasets with different degrees of batch effects generated from HumanMethylation27 platform: quantile normalization at average β value (QNβ); two step quantile normalization at probe signals implemented in "lumi" package of R (lumi); and quantile normalization of A and B signal separately (ABnorm). Subsequent Empirical Bayes (EB) batch adjustment was also evaluated.
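The quantile normalization step named above can be illustrated with a minimal sketch. It applies generic quantile normalization to per-sample β values, which is only an approximation of the QNβ idea; it is not the "lumi" implementation, and the function name `quantile_normalize` and the toy β values are assumptions made for illustration.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one list of values per sample)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of probes")
    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    normalized = []
    for col in columns:
        # Rank the probes within this sample (ties are broken by position).
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Each probe receives the reference value for its rank.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical beta values: three samples, four CpG probes each.
beta = [
    [0.10, 0.50, 0.90, 0.30],
    [0.20, 0.60, 0.95, 0.40],
    [0.05, 0.45, 0.80, 0.35],
]
normalized = quantile_normalize(beta)
```

After this step every sample shares the same distribution of values while the within-sample ordering of probes is preserved. Batch differences that change probe ranks are not removed by it, which is consistent with the need for a further correction step such as the EB adjustment.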
Results
Each normalization method could remove a portion of the batch effects, and their effectiveness differed depending on the severity of batch effects in a dataset. For the dataset with minor batch effects (Dataset 1), normalization alone appeared adequate and "lumi" showed the best performance. However, all methods left substantial batch effects intact in the datasets with obvious batch effects and further correction was necessary. Without any correction, 50 and 66 percent of CpGs were associated with batch effects in Datasets 2 and 3, respectively. After QNβ, lumi or ABnorm, the proportion of CpGs associated with batch effects was reduced to 24, 32, and 26 percent for Dataset 2, and to 37, 46, and 35 percent for Dataset 3, respectively. Additional EB correction effectively removed such remaining non-biological effects. More importantly, the two-step procedure almost tripled the numbers of CpGs associated with the outcome of interest for the two datasets.
Conclusion
Genome-wide methylation data from Infinium Methylation BeadChip can be susceptible to batch effects with profound impacts on downstream analyses and conclusions. Normalization can remove some, but not all, batch effects. EB correction along with normalization is recommended for effective batch effect removal.
doi:10.1186/1755-8794-4-84
PMCID: PMC3265417  PMID: 22171553
4.  Transforming growth factor β receptor 1 is a new candidate prognostic biomarker after acute myocardial infarction 
BMC Medical Genomics  2011;4:83.
Background
Prediction of left ventricular (LV) remodeling after acute myocardial infarction (MI) is clinically important and would benefit from the discovery of new biomarkers.
Methods
Blood samples were obtained upon admission in patients with acute ST-elevation MI who underwent primary percutaneous coronary intervention. Messenger RNA was extracted from whole blood cells. LV function was evaluated by echocardiography at 4 months.
Results
In a test cohort of 32 MI patients, integrated analysis of microarrays with a network of protein-protein interactions identified subgroups of genes which predicted LV dysfunction (ejection fraction ≤ 40%) with areas under the receiver operating characteristic curve (AUC) above 0.80. Candidate genes included transforming growth factor beta receptor 1 (TGFBR1). In a validation cohort of 115 MI patients, TGFBR1 was up-regulated in patients with LV dysfunction (P < 0.001) and was associated with LV function at 4 months (P = 0.003). TGFBR1 predicted LV function with an AUC of 0.72, while peak levels of troponin T (TnT) provided an AUC of 0.64. Adding TGFBR1 to the prediction of TnT resulted in a net reclassification index of 8.2%. When added to a mixed clinical model including age, gender and time to reperfusion, TGFBR1 reclassified 17.7% of misclassified patients. TGFB1, the ligand of TGFBR1, was also up-regulated in patients with LV dysfunction (P = 0.004), was associated with LV function (P = 0.006), and provided an AUC of 0.66. In the rat MI model induced by permanent coronary ligation, the TGFB1-TGFBR1 axis was activated in the heart and correlated with the extent of remodeling at 2 months.
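For readers unfamiliar with the AUC values quoted above, the following minimal sketch shows how an AUC can be computed from a single marker as the probability that a randomly chosen case scores higher than a randomly chosen non-case. The function name `roc_auc` and the expression values are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical marker levels; 1 = LV dysfunction, 0 = preserved LV function.
expression = [2.1, 3.4, 1.8, 2.9, 3.9, 2.4]
lv_dysfunction = [0, 1, 0, 0, 1, 1]
auc = roc_auc(expression, lv_dysfunction)
```

An AUC of 0.5 corresponds to a marker that ranks cases and non-cases no better than chance, and 1.0 to perfect separation, which gives a scale for reading values such as 0.72 and 0.64.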
Conclusions
We identified TGFBR1 as a new candidate prognostic biomarker after acute MI.
doi:10.1186/1755-8794-4-83
PMCID: PMC3240818  PMID: 22136666
5.  Identification of DNA methylation changes associated with human gastric cancer 
BMC Medical Genomics  2011;4:82.
Background
Epigenetic alteration of gene expression is a common event in human cancer. DNA methylation is a well-known epigenetic process, but verifying the exact nature of epigenetic changes associated with cancer remains difficult.
Methods
We profiled the methylome of human gastric cancer tissue at 50-bp resolution using a methylated DNA enrichment technique (methylated CpG island recovery assay) in combination with a genome analyzer and a new normalization algorithm.
Results
We were able to gain a comprehensive view of promoters with various CpG densities, including CpG Islands (CGIs), transcript bodies, and various repeat classes. We found that gastric cancer was associated with hypermethylation of 5' CGIs and the 5'-end of coding exons as well as hypomethylation of repeat elements, such as short interspersed nuclear elements and the composite element SVA. Hypermethylation of 5' CGIs was significantly correlated with downregulation of associated genes, such as those in the HOX and histone gene families. We also discovered long-range epigenetic silencing (LRES) regions in gastric cancer tissue and identified several hypermethylated genes (MDM2, DYRK2, and LYZ) within these regions. The methylation status of CGIs and gene annotation elements in metastatic lymph nodes was intermediate between normal and cancerous tissue, indicating that methylation of specific genes is gradually increased in cancerous tissue.
Conclusions
Our findings will provide valuable data for future analysis of CpG methylation patterns, useful markers for the diagnosis of stomach cancer, as well as a new analysis method for clinical epigenomics investigations.
doi:10.1186/1755-8794-4-82
PMCID: PMC3273443  PMID: 22133303
6.  Estimates of array and pool-construction variance for planning efficient DNA-pooling genome wide association studies 
BMC Medical Genomics  2011;4:81.
Background
Until recently, genome-wide association studies (GWAS) have been restricted to research groups with the budget necessary to genotype hundreds, if not thousands, of samples. Replacing individual genotyping with genotyping of DNA pools in Phase I of a GWAS has proven successful, and dramatically altered the financial feasibility of this approach. When conducting a pool-based GWAS, how well SNP allele frequency is estimated from a DNA pool will influence a study's power to detect associations. Here we address how to control the variance in allele frequency estimation when DNAs are pooled, and how to plan and conduct the most efficient well-powered pool-based GWAS.
Methods
By examining the variation in allele frequency estimation on SNP arrays between and within DNA pools we determine how array variance [var(e_array)] and pool-construction variance [var(e_construction)] contribute to the total variance of allele frequency estimation. This information is useful in deciding whether replicate arrays or replicate pools are most useful in reducing variance. Our analysis is based on 27 DNA pools ranging in size from 74 to 446 individual samples, genotyped on a collective total of 128 Illumina beadarrays: 24 1M-Single, 32 1M-Duo, and 72 660-Quad.
Results
For all three Illumina SNP array types our estimates of var(e_array) were similar, between 3 × 10⁻⁴ and 4 × 10⁻⁴ for normalized data. Var(e_construction) accounted for between 20% and 40% of pooling variance across 27 pools in normalized data.
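The trade-off between replicate arrays and replicate pools can be sketched under a simple nested error model, which is an assumption here and not necessarily the exact model used by the authors: averaging over replicate arrays shrinks only the array component, while constructing additional pools shrinks both components. The function name and the construction variance of 1.5 × 10⁻⁴ are hypothetical; the array variance of 3.5 × 10⁻⁴ is chosen from within the reported range.

```python
def pooling_variance(var_array, var_construction, n_pools=1, arrays_per_pool=1):
    """Variance of an averaged allele-frequency estimate (assumed nested model)."""
    if n_pools < 1 or arrays_per_pool < 1:
        raise ValueError("need at least one pool and one array per pool")
    # Array error averages out within a pool; construction error does not.
    per_pool = var_construction + var_array / arrays_per_pool
    # Independent replicate pools average out both sources of error.
    return per_pool / n_pools


VAR_ARRAY = 3.5e-4         # within the range reported for normalized data
VAR_CONSTRUCTION = 1.5e-4  # hypothetical: 30% of a single-array pooling variance

single = pooling_variance(VAR_ARRAY, VAR_CONSTRUCTION)
two_arrays = pooling_variance(VAR_ARRAY, VAR_CONSTRUCTION, arrays_per_pool=2)
two_pools = pooling_variance(VAR_ARRAY, VAR_CONSTRUCTION, n_pools=2)
```

With these illustrative numbers, a second pool lowers the variance more than a second array on the same pool, because only the former reduces the construction component.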
Conclusions
We conclude that relative to var(e_array), var(e_construction) is of less importance in reducing the variance in allele frequency estimation from DNA pools; however, our data suggest that on average it may be more important than previously thought. We have prepared a simple online tool, PoolingPlanner (available at http://www.kchew.ca/PoolingPlanner/), which calculates the effective sample size (ESS) of a DNA pool given a range of replicate array values. ESS can be used in a power calculator to perform pool-adjusted calculations. This allows one to quickly calculate the loss of power associated with a pooling experiment to make an informed decision on whether a pool-based GWAS is worth pursuing.
doi:10.1186/1755-8794-4-81
PMCID: PMC3247851  PMID: 22122996
7.  Analysis of BMP4 and BMP7 signaling in breast cancer cells unveils time-dependent transcription patterns and highlights a common synexpression group of genes 
BMC Medical Genomics  2011;4:80.
Background
Bone morphogenetic proteins (BMPs) are members of the TGF-beta superfamily of growth factors. They are known for their roles in regulation of osteogenesis and developmental processes and, in recent years, evidence has accumulated of their crucial functions in tumor biology. BMP4 and BMP7, in particular, have been implicated in breast cancer. However, little is known about BMP target genes in the context of tumor. We explored the effects of BMP4 and BMP7 treatment on global gene transcription in seven breast cancer cell lines during a 6-point time series, using a whole-genome oligo microarray. Data analysis included hierarchical clustering of differentially expressed genes, gene ontology enrichment analyses and model based clustering of temporal data.
Results
Both ligands had a strong effect on gene expression, although the response to BMP4 treatment was more pronounced. The cellular functions most strongly affected by BMP signaling were regulation of transcription and development. The observed transcriptional response, as well as its functional outcome, followed a temporal sequence, with regulation of gene expression and signal transduction leading to changes in metabolism and cell proliferation. Hierarchical clustering revealed distinct differences in the response of individual cell lines to BMPs, but also highlighted a synexpression group of genes for both ligands. Interestingly, the majority of the genes within these synexpression groups were shared by the two ligands, probably representing the core molecular responses common to BMP4 and BMP7 signaling pathways.
Conclusions
All in all, we show that BMP signaling has a remarkable effect on gene transcription in breast cancer cells and that the functions affected follow a logical temporal pattern. Our results also uncover components of the common cellular transcriptional response to BMP4 and BMP7. Most importantly, this study provides a list of potential novel BMP target genes relevant in breast cancer.
doi:10.1186/1755-8794-4-80
PMCID: PMC3229454  PMID: 22118688
bone morphogenetic protein; breast cancer; BMP4; BMP7; expression microarray
8.  miRNA signature associated with outcome of gastric cancer patients following chemotherapy 
BMC Medical Genomics  2011;4:79.
Background
Identification of patients who likely will or will not benefit from cytotoxic chemotherapy through the use of biomarkers could greatly improve clinical management by better defining appropriate treatment options for patients. microRNAs may be potentially useful biomarkers that help guide individualized therapy for cancer because microRNA expression is dysregulated in cancer. In order to identify miRNA signatures for gastric cancer and for predicting clinical resistance to cisplatin/fluorouracil (CF) chemotherapy, a comprehensive miRNA microarray analysis was performed using endoscopic biopsy samples.
Methods
Biopsy samples were collected prior to chemotherapy from 90 gastric cancer patients treated with CF and from 34 healthy volunteers. At the time of disease progression, post-treatment samples were additionally collected from 8 clinical responders. miRNA expression was determined using a custom-designed Agilent microarray. In order to identify a miRNA signature for chemotherapy resistance, we correlated miRNA expression levels with the time to progression (TTP) of disease after CF therapy.
Results
A miRNA signature distinguishing gastric cancer from normal stomach epithelium was identified. 30 miRNAs were significantly inversely correlated with TTP whereas 28 miRNAs were significantly positively correlated with TTP of 82 cancer patients (P<0.05). Prominent among the upregulated miRNAs associated with chemosensitivity were miRNAs known to regulate apoptosis, including let-7g, miR-342, miR-16, miR-181, miR-1, and miR-34. When this 58-miRNA predictor was applied to a separate set of pre- and post-treatment tumor samples from the 8 clinical responders, all of the 8 pre-treatment samples were correctly predicted as low-risk, whereas samples from the post-treatment tumors that developed chemoresistance were predicted to be in the high-risk category by the 58 miRNA signature, suggesting that selection for the expression of these miRNAs occurred as chemoresistance arose.
Conclusions
We have identified 1) a miRNA expression signature that distinguishes gastric cancer from normal stomach epithelium from healthy volunteers, and 2) a chemoresistance miRNA expression signature that is correlated with TTP after CF therapy. The chemoresistance miRNA expression signature includes several miRNAs previously shown to regulate apoptosis in vitro, and warrants further validation.
doi:10.1186/1755-8794-4-79
PMCID: PMC3287139  PMID: 22112324
9.  MicroRNA profiling of diverse endothelial cell types 
BMC Medical Genomics  2011;4:78.
Background
MicroRNAs are ~22-nt long regulatory RNAs that serve as critical modulators of post-transcriptional gene regulation. The diversity of miRNAs in endothelial cells (ECs) and the relationship of this diversity to epithelial and hematologic cells are unknown. We investigated the baseline miRNA signature of human ECs cultured from the aorta (HAEC), coronary artery (HCEC), umbilical vein (HUVEC), pulmonary artery (HPAEC), pulmonary microvasculature (HPMVEC), dermal microvasculature (HDMVEC), and brain microvasculature (HBMVEC) to understand the diversity of miRNA expression in ECs.
Results
We identified 166 expressed miRNAs, of which 3 miRNAs (miR-99b, miR-20b and let-7b) differed significantly between EC types and predicted EC clustering. We confirmed the significance of these miRNAs by RT-PCR analysis and in a second data set by Sylamer analysis. We found wide diversity of miRNAs between endothelial, epithelial and hematologic cells with 99 miRNAs shared across cell types and 31 miRNAs unique to ECs. We show polycistronic miRNA chromosomal clusters have common expression levels within a given cell type.
Conclusions
EC miRNA expression levels are generally consistent across EC types. Three microRNAs were variable within the dataset, indicating potential regulatory changes that could affect EC phenotypic differences. MiRNA expression in endothelial, epithelial and hematologic cells differentiates these cell types. These data establish a valuable resource characterizing the diverse miRNA signature of ECs.
doi:10.1186/1755-8794-4-78
PMCID: PMC3223144  PMID: 22047531
miR-99b; miR-20b; let-7b
10.  Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features 
BMC Medical Genomics  2011;4:77.
Background
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk.
Methods
Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets.
Results
Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions.
Conclusion
This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts, and it identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
doi:10.1186/1755-8794-4-77
PMCID: PMC3216859  PMID: 22044755
Gene expression; normal breast tissue; hierarchical clustering; claudin-low
11.  Gene network analyses point to the importance of human tissue kallikreins in melanoma progression 
BMC Medical Genomics  2011;4:76.
Background
A wide variety of high-throughput microarray platforms have been used to identify molecular targets associated with biological and clinical tumor phenotypes by comparing samples representing distinct pathological states.
Methods
The gene expression profiles of human cutaneous melanomas were determined by cDNA microarray analysis. Next, a robust analysis to determine functional classifications and make predictions based on data-oriented hypotheses was performed. Relevant networks that may be implicated in melanoma progression were also considered.
Results
In this study we aimed to analyze coordinated gene expression changes to find molecular pathways involved in melanoma progression. To achieve this goal, ontologically-linked modules with coordinated expression changes in melanoma samples were identified. With this approach, we detected several gene networks related to different modules that were induced or repressed during melanoma progression. Among them we observed high coordinated expression levels of genes involved in a) cell communication (KRT4, VWF and COMP); b) epidermal development (KLK7, LAMA3 and EVPL); and c) functionally related to kallikreins (EVPL, KLK6, KLK7, KLK8, SERPINB13, SERPING1 and SLPI). Our data also indicated that hKLK7 protein expression was significantly associated with good prognosis and survival.
Conclusions
Our findings, derived from a different type of analysis of microarray data, highlight the importance of analyzing coordinated gene expression to find molecular pathways involved in melanoma progression.
doi:10.1186/1755-8794-4-76
PMCID: PMC3212933  PMID: 22032772
12.  Identification of gene fusion transcripts by transcriptome sequencing in BRCA1-mutated breast cancers and cell lines 
BMC Medical Genomics  2011;4:75.
Background
Gene fusions arising from chromosomal translocations have been implicated in cancer. However, the role of gene fusions in BRCA1-related breast cancers is not well understood. Mutations in BRCA1 are associated with an increased risk for breast cancer (up to 80% lifetime risk) and ovarian cancer (up to 50%). We sought to identify putative gene fusions in the transcriptomes of these cancers using high-throughput RNA sequencing (RNA-Seq).
Methods
We used Illumina sequencing technology to sequence the transcriptomes of five BRCA1-mutated breast cancer cell lines, three BRCA1-mutated primary tumors, two secretory breast cancer primary tumors and one non-tumorigenic breast epithelial cell line. Using a bioinformatics approach, our initial attempt at discovering putative gene fusions relied on analyzing single-end reads and identifying reads that aligned across exons of two different genes. Subsequently, later samples were sequenced with paired-end reads and at longer cycles (producing longer reads). We then refined our approach by identifying misaligned paired reads, which may flank a putative gene fusion junction.
Results
As a proof of concept, we were able to identify two previously characterized gene fusions in our samples using both single-end and paired-end approaches. In addition, we identified three novel in-frame fusions, but none were recurrent. Two of the candidates, WWC1-ADRBK2 in the HCC3153 cell line and ADNP-C20orf132 in a primary tumor, were confirmed by Sanger sequencing and RT-PCR. RNA-Seq expression profiling of these two fusions showed a distinct overexpression of the 3' partner genes, suggesting that their expression may be under the control of the 5' partner gene's regulatory elements.
Conclusions
In this study, we used both single-end and paired-end sequencing strategies to discover gene fusions in breast cancer transcriptomes with BRCA1 mutations. We found that the use of paired-end reads is an effective tool for transcriptome profiling of gene fusions. Our findings suggest that while gene fusions are present in some BRCA1-mutated breast cancers, they are infrequent and not recurrent. However, private fusions may still be valuable as potential patient-specific biomarkers for diagnosis and treatment.
doi:10.1186/1755-8794-4-75
PMCID: PMC3227591  PMID: 22032724
13.  Microarray analysis of peripheral blood lymphocytes from ALS patients and the SAFE detection of the KEGG ALS pathway 
BMC Medical Genomics  2011;4:74.
Background
Sporadic amyotrophic lateral sclerosis (sALS) is a motor neuron disease with poorly understood etiology. Results of gene expression profiling studies of whole blood from ALS patients have not been validated and are difficult to relate to ALS pathogenesis because gene expression profiles depend on the relative abundance of the different cell types present in whole blood. We conducted microarray analyses using Agilent Human Whole Genome 4 × 44k Arrays on a more homogeneous cell population, namely purified peripheral blood lymphocytes (PBLs), from ALS patients and healthy controls to identify molecular signatures possibly relevant to ALS pathogenesis.
Methods
Differentially expressed genes were determined by LIMMA (Linear Models for MicroArray) and SAM (Significance Analysis of Microarrays) analyses. The SAFE (Significance Analysis of Function and Expression) procedure was used to identify molecular pathway perturbations. Proteasome inhibition assays were conducted on cultured peripheral blood mononuclear cells (PBMCs) from ALS patients to confirm alteration of the Ubiquitin/Proteasome System (UPS).
Results
For the first time, using SAFE in a global gene ontology analysis (gene set size 5-100), we show significant perturbation of the KEGG (Kyoto Encyclopedia of Genes and Genomes) ALS pathway of motor neuron degeneration in PBLs from ALS patients. This was the only KEGG disease pathway significantly upregulated among 25, and contributing genes, including SOD1, represented 54% of the encoded proteins or protein complexes of the KEGG ALS pathway. Further SAFE analysis, including gene set sizes >100, showed that only neurodegenerative diseases (4 out of 34 disease pathways) including ALS were significantly upregulated. Changes in UBR2 expression correlated inversely with time since onset of disease and directly with ALSFRS-R, implying that UBR2 was increased early in the course of ALS. Cultured PBMCs from ALS patients accumulated more ubiquitinated proteins than PBMCs from healthy controls in a serum-dependent manner confirming changes in this pathway.
Conclusions
Our study indicates that PBLs from sALS patients are strong responders to systemic signals or local signals acquired by cell trafficking, representing changes in gene expression similar to those present in brain and spinal cord of sALS patients. PBLs may provide a useful means to study ALS pathogenesis.
doi:10.1186/1755-8794-4-74
PMCID: PMC3219589  PMID: 22027401
14.  Quantifying stability in gene list ranking across microarray derived clinical biomarkers 
BMC Medical Genomics  2011;4:73.
Background
Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. However, gene expression profile studies and predictive biomarkers are often of low power, requiring numerous samples for sound statistics, or vary between studies. Given the inconsistency of results across similar studies, methods that identify robust biomarkers from microarray data are needed to relay true biological information. Here we present a method to demonstrate that gene list stability and predictive power depend not only on the size of studies, but also on the clinical phenotype.
Results
Our method projects genomic tumor expression data to a lower dimensional space representing the main variation in the data. Some information regarding the phenotype resides in this low dimensional space, while some information resides in the residuum. We then introduce an information ratio (IR) as a metric defined by the partition between projected and residual space. Upon grouping phenotypes such as tumor tissue, histological grades, relapse, or aging, we show that higher IR values correlated with phenotypes that yield less robust biomarkers whereas lower IR values showed higher transferability across studies. Our results indicate that the IR is correlated with predictive accuracy. When tested across different published datasets, the IR can identify information-rich data characterizing clinical phenotypes and stable biomarkers.
Conclusions
The IR presents a quantitative metric to estimate the information content of gene expression data with respect to particular phenotypes.
doi:10.1186/1755-8794-4-73
PMCID: PMC3206838  PMID: 21996057
15.  "Who owns your poop?": insights regarding the intersection of human microbiome research and the ELSI aspects of biobanking and related studies 
BMC Medical Genomics  2011;4:72.
Background
While the social, ethical, and legal implications of biobanking and large scale data sharing are already complicated enough, they may be further compounded by research on the human microbiome.
Discussion
The human microbiome is the entire complement of microorganisms that exists in and on every human body. Currently most biobanks focus primarily on human tissues and/or associated data (e.g. health records). Accordingly, most discussions in the social sciences and humanities on these issues are focused (appropriately so) on the implications of biobanks and sharing data derived from human tissues. However, rapid advances in human microbiome research involve collecting large amounts of data on microorganisms that exist in symbiotic relationships with the human body. Currently it is not clear whether these microorganisms should be considered part of or separate from the human body. Arguments can be made for both, but ultimately it seems that the dichotomy of human versus non-human and self versus non-self inevitably breaks down in this context. This situation has the potential to add further complications to debates on biobanking.
Summary
In this paper, we revisit some of the core problem areas of privacy, consent, ownership, return of results, governance, and benefit sharing, and consider how they might be impacted upon by human microbiome research. Some of the issues discussed also have relevance to other forms of microbial research. Discussion of these themes is guided by conceptual analysis of microbiome research and interviews with leading Canadian scientists in the field.
doi:10.1186/1755-8794-4-72
PMCID: PMC3199231  PMID: 21982589
human microbiome; health research; consent; privacy; ownership; return of results; policy; biobanks; ELSI; research ethics
16.  Comparative analysis of the human hepatic and adipose tissue transcriptomes during LPS-induced inflammation leads to the identification of differential biological pathways and candidate biomarkers 
BMC Medical Genomics  2011;4:71.
Background
Insulin resistance (IR) is accompanied by chronic low grade systemic inflammation, obesity, and deregulation of total body energy homeostasis. We induced inflammation in adipose and liver tissues in vitro in order to mimic inflammation in vivo, with the aim of identifying tissue-specific processes implicated in IR and finding biomarkers indicative of tissue-specific IR.
Methods
Human adipose and liver tissues were cultured in the absence or presence of LPS and DNA Microarray Technology was applied for their transcriptome analysis. Gene Ontology (GO), gene functional analysis, and prediction of genes encoding for secretome were performed using publicly available bioinformatics tools (DAVID, STRING, SecretomeP). The transcriptome data were validated by proteomics analysis of the inflamed adipose tissue secretome.
Results
LPS treatment significantly affected 667 and 483 genes in adipose and liver tissues respectively. The GO analysis revealed that during inflammation adipose tissue, compared to liver tissue, had more significantly upregulated genes, GO terms, and functional clusters related to inflammation and angiogenesis. The secretome prediction led to identification of 399 and 236 genes in adipose and liver tissue respectively. The secretomes of both tissues shared 66 genes and the remaining genes were the differential candidate biomarkers indicative of inflamed adipose or liver tissue. The transcriptome data of the inflamed adipose tissue secretome showed excellent correlation with the proteomics data.
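The secretome counts above imply the size of each tissue-specific candidate set by simple set arithmetic. The sketch below is illustrative only: the gene identifiers are placeholders generated to match the reported counts (399, 236, and 66 shared), and `partition_secretomes` is a hypothetical helper, not a tool used in the study.

```python
def partition_secretomes(adipose_genes, liver_genes):
    """Return (shared, adipose_only, liver_only) gene sets."""
    adipose = set(adipose_genes)
    liver = set(liver_genes)
    return adipose & liver, adipose - liver, liver - adipose


# Placeholder identifiers sized to match the counts reported in the abstract.
shared_ids = {f"shared_{i}" for i in range(66)}
adipose_secretome = shared_ids | {f"adipose_{i}" for i in range(399 - 66)}
liver_secretome = shared_ids | {f"liver_{i}" for i in range(236 - 66)}

shared, adipose_only, liver_only = partition_secretomes(
    adipose_secretome, liver_secretome
)
print(len(shared), len(adipose_only), len(liver_only))
```

With these counts, 333 predicted secretome genes would be specific to adipose tissue and 170 to liver tissue.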
Conclusions
The higher number of altered proinflammatory genes, GO processes, and genes encoding the secretome during inflammation in adipose tissue compared to liver tissue suggests that adipose tissue is the major organ contributing to the development of systemic inflammation observed in IR. The identified tissue-specific functional clusters and biomarkers might be used in a strategy for the development of tissue-targeted treatment of insulin resistance in patients.
doi:10.1186/1755-8794-4-71
PMCID: PMC3196688  PMID: 21978410
17.  Bayesian probit regression model for the diagnosis of pulmonary fibrosis: proof-of-principle 
BMC Medical Genomics  2011;4:70.
Background
The accurate diagnosis of idiopathic pulmonary fibrosis (IPF) is a major clinical challenge. We developed a model to diagnose IPF by applying Bayesian probit regression (BPR) modelling to gene expression profiles of whole lung tissue.
Methods
Whole lung tissue was obtained from patients with idiopathic pulmonary fibrosis (IPF) undergoing surgical lung biopsy or lung transplantation. Controls were obtained from normal organ donors. We performed cluster analyses to explore differences in our dataset. No significant difference was found between samples obtained from different lobes of the same patient. A significant difference was found between samples obtained at biopsy versus explant. Following preliminary analysis of the complete dataset, we selected three subsets for the development of diagnostic gene signatures: the first signature was developed from all IPF samples (as compared to controls); the second signature was developed from the subset of IPF samples obtained at biopsy; the third signature was developed from IPF explants. To assess the validity of each signature, we used an independent cohort of IPF and normal samples. Each signature was used to predict phenotype (IPF versus normal) in samples from the validation cohort. We compared the models' predictions to the true phenotype of each validation sample, and then calculated sensitivity, specificity and accuracy.
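The final validation step, comparing predictions to true phenotypes, can be sketched as follows. This is a generic illustration, not the authors' code: the function name, the label strings, and the eight validation samples are made up for the example.

```python
def diagnostic_metrics(true_labels, predicted_labels, positive="IPF"):
    """Return (sensitivity, specificity, accuracy) for a binary diagnosis."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Sensitivity: fraction of true IPF samples called IPF.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true normal samples called normal.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical validation cohort, for illustration only.
truth = ["IPF", "IPF", "IPF", "normal", "normal", "IPF", "normal", "normal"]
preds = ["IPF", "IPF", "normal", "normal", "IPF", "IPF", "normal", "normal"]
sens, spec, acc = diagnostic_metrics(truth, preds)
```

Here one of four IPF samples and one of four normal samples are misclassified, so all three metrics equal 0.75.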
Results
Surprisingly, we found that all three signatures were reasonably valid predictors of diagnosis, with small differences in test sensitivity, specificity and overall accuracy.
Conclusions
This study represents the first use of BPR on whole lung tissue; previously, BPR was primarily used to develop predictive models for cancer. This also represents the first report of an independently validated IPF gene expression signature. In summary, BPR is a promising tool for the development of gene expression signatures from non-neoplastic lung tissue. In the future, BPR might be used to develop definitive diagnostic gene signatures for IPF, prognostic gene signatures for IPF or gene signatures for other non-neoplastic lung disorders such as bronchiolitis obliterans.
doi:10.1186/1755-8794-4-70
PMCID: PMC3199230  PMID: 21974901
18.  Bridging consent: from toll bridges to lift bridges? 
BMC Medical Genomics  2011;4:69.
Background
The ability to share human biological samples, associated data and results across disease-specific and population-based human research biobanks is becoming increasingly important for research into disease development and translation. Although informed consent often does not anticipate such cross-domain sharing, it is important to examine its plausibility. The purpose of this study was to explore the feasibility of bridging consent between disease-specific and population-based research. Comparative analyses of 1) current ethical and legal frameworks governing consent and 2) informed consent models found in disease-specific and population-based research were conducted.
Discussion
Ethical and legal frameworks governing consent dissuade cross-domain data sharing. Paradoxically, analysis of consent models for disease-specific and population-based research reveals such a high degree of similarity that bridging consent could be possible if additional information regarding bridging was incorporated into consent forms. We submit that bridging of consent could be supported if current trends endorsing a new interpretation of consent are adopted. To illustrate this we sketch potential bridging consent scenarios.
Summary
A bridging consent, respectful of the spirit of initial consent, is feasible and would require only small changes to the content of consents currently being used. Under a bridging consent approach, the initial data and samples collection can serve an identified research project as well as contribute to the creation of a resource for a range of other projects.
doi:10.1186/1755-8794-4-69
PMCID: PMC3206837  PMID: 21970509
19.  Targeted high throughput sequencing in clinical cancer Settings: formaldehyde fixed-paraffin embedded (FFPE) tumor tissues, input amount and tumor heterogeneity 
BMC Medical Genomics  2011;4:68.
Background
Massively parallel sequencing technologies have brought an enormous increase in sequencing throughput. However, these technologies need to be further improved with regard to reproducibility and applicability to clinical samples and settings.
Methods
Using identification of genetic variations in prostate cancer as an example we address three crucial challenges in the field of targeted re-sequencing: Small nucleotide variation (SNV) detection in samples of formalin-fixed paraffin embedded (FFPE) tissue material, minimal amount of input sample and sampling in view of tissue heterogeneity.
Results
We show that FFPE tissue material can supplement fresh frozen tissues for the detection of SNVs and that solution-based enrichment experiments can be accomplished with small amounts of DNA with only minimal effects on enrichment uniformity and data variance.
Finally, we address the question of whether the heterogeneity of a tumor is reflected by different genetic alterations, e.g. whether different foci of a tumor display different genomic patterns. We show that tumor heterogeneity plays an important role in the detection of copy number variations.
Conclusions
The application of high throughput sequencing technologies in cancer genomics opens up a new dimension for the identification of disease mechanisms. In particular the ability to use small amounts of FFPE samples available from surgical tumor resections and histopathological examinations facilitates the collection of precious tissue materials. However, care needs to be taken in regard to the locations of the biopsies, which can have an influence on the prediction of copy number variations. Bearing these technological challenges in mind will significantly improve many large-scale sequencing studies and will - in the long term - result in a more reliable prediction of individual cancer therapies.
doi:10.1186/1755-8794-4-68
PMCID: PMC3192667  PMID: 21958464
20.  Gene expression during normal and FSHD myogenesis 
BMC Medical Genomics  2011;4:67.
Background
Facioscapulohumeral muscular dystrophy (FSHD) is a dominant disease linked to contraction of an array of tandem 3.3-kb repeats (D4Z4) at 4q35. Within each repeat unit is a gene, DUX4, that can encode a protein containing two homeodomains. A DUX4 transcript derived from the last repeat unit in a contracted array is associated with pathogenesis but it is unclear how.
Methods
Using exon-based microarrays, the expression profiles of myogenic precursor cells were determined. Both undifferentiated myoblasts and myoblasts differentiated to myotubes derived from FSHD patients and controls were studied after immunocytochemical verification of the quality of the cultures. To further our understanding of FSHD and normal myogenesis, the expression profiles obtained were compared to those of 19 non-muscle cell types analyzed by identical methods.
Results
Many of the ~17,000 examined genes were differentially expressed (> 2-fold, p < 0.01) in control myoblasts or myotubes vs. non-muscle cells (2185 and 3006, respectively) or in FSHD vs. control myoblasts or myotubes (295 and 797, respectively). Surprisingly, despite the morphologically normal differentiation of FSHD myoblasts to myotubes, most of the disease-related dysregulation was seen as dampening of normal myogenesis-specific expression changes, including in genes for muscle structure, mitochondrial function, stress responses, and signal transduction. Other classes of genes, including those encoding extracellular matrix or pro-inflammatory proteins, were upregulated in FSHD myogenic cells independent of an inverse myogenesis association. Importantly, the disease-linked DUX4 RNA isoform was detected by RT-PCR in FSHD myoblast and myotube preparations only at extremely low levels. Unique insights into myogenesis-specific gene expression were also obtained. For example, all four Argonaute genes involved in RNA-silencing were significantly upregulated during normal (but not FSHD) myogenesis relative to non-muscle cell types.
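The selection criterion quoted above (> 2-fold, p < 0.01) can be expressed as a simple filter. The sketch assumes linear-scale, strictly positive mean expression values and a p-value computed elsewhere; the gene names and numbers are hypothetical, and the statistical test itself is not shown.

```python
import math


def is_differentially_expressed(mean_a, mean_b, p_value,
                                min_fold=2.0, max_p=0.01):
    """True if expression differs by more than `min_fold` in either
    direction and the supplied p-value is below `max_p`."""
    log2_fold_change = math.log2(mean_a / mean_b)
    return abs(log2_fold_change) > math.log2(min_fold) and p_value < max_p


# Hypothetical genes: (name, mean in group A, mean in group B, p-value).
genes = [
    ("GENE_A", 400.0, 100.0, 0.001),   # 4-fold up, significant
    ("GENE_B", 150.0, 100.0, 0.0005),  # only 1.5-fold
    ("GENE_C", 50.0, 200.0, 0.2),      # 4-fold down, not significant
    ("GENE_D", 50.0, 200.0, 0.005),    # 4-fold down, significant
]
hits = [name for name, a, b, p in genes
        if is_differentially_expressed(a, b, p)]
```

Working on the log2 scale treats up- and downregulation symmetrically, so a 4-fold decrease passes the same threshold as a 4-fold increase.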
Conclusions
DUX4's pathogenic effect in FSHD may occur transiently at or before the stage of myoblast formation to establish a cascade of gene dysregulation. This contrasts with the current emphasis on toxic effects of experimentally upregulated DUX4 expression at the myoblast or myotube stages. Our model could explain why DUX4's inappropriate expression was barely detectable in myoblasts and myotubes but nonetheless linked to FSHD.
doi:10.1186/1755-8794-4-67
PMCID: PMC3204225  PMID: 21951698
21.  mRNA expression profiles of primary high-grade central osteosarcoma are preserved in cell lines and xenografts 
BMC Medical Genomics  2011;4:66.
Background
Conventional high-grade osteosarcoma is a primary malignant bone tumor, which is most prevalent in adolescence. Survival rates of osteosarcoma patients have not improved significantly in the last 25 years. Aiming to increase this survival rate, a variety of model systems are used to study osteosarcomagenesis and to test new therapeutic agents. Such model systems are typically generated from an osteosarcoma primary tumor, but undergo many changes due to culturing or interactions with a different host species, which may result in differences in gene expression between primary tumor cells, and tumor cells from the model system. We aimed to investigate whether gene expression profiles of osteosarcoma cell lines and xenografts are still comparable to those of the primary tumor.
Methods
We performed genome-wide mRNA expression profiling on osteosarcoma biopsies (n = 76), cell lines (n = 13), and xenografts (n = 18). Osteosarcoma can be subdivided into several histological subtypes, of which osteoblastic, chondroblastic, and fibroblastic osteosarcoma are the most frequent ones. Using nearest shrunken centroids classification, we generated an expression signature that can predict the histological subtype of osteosarcoma biopsies.
Results
The expression signature, which consisted of 24 probes encoding for 22 genes, predicted the histological subtype of osteosarcoma biopsies with a misclassification error of 15%. Histological subtypes of the two osteosarcoma model systems, i.e. osteosarcoma cell lines and xenografts, were predicted with similar misclassification error rates (15% and 11%, respectively).
Conclusions
Based on the preservation of mRNA expression profiles that are characteristic of the histological subtype, we propose that these model systems are representative of the primary tumor from which they are derived.
doi:10.1186/1755-8794-4-66
PMCID: PMC3193807  PMID: 21933437
22.  MicroRNA-34a modulates genes involved in cellular motility and oxidative phosphorylation in neural precursors derived from human umbilical cord mesenchymal stem cells 
BMC Medical Genomics  2011;4:65.
Background
Mesenchymal stem cells (MSCs) found in bone marrow (BM-MSCs) and the Wharton's jelly matrix of human umbilical cord (WJ-MSCs) are able to transdifferentiate into neuronal lineage cells both in vitro and in vivo and therefore hold the potential to treat neural disorders such as stroke or Parkinson's disease. In bone marrow MSCs, miR-130a and miR-206 have been shown to regulate the synthesis of neurotransmitter substance P in human mesenchymal stem cell-derived neuronal cells. However, how neuronal differentiation is controlled in WJ-MSCs remains unclear.
Methods
WJ-MSCs were isolated from human umbilical cords. We induced neurogenesis in WJ-MSCs using a published protocol, and the miRNome patterns of WJ-MSCs and their neuronal progenitors (day 9 after differentiation) were analyzed using the Agilent microRNA microarray.
Results
Five miRNAs were enriched in WJ-MSCs, including miR-345, miR-106a, miR-17-5p, miR-20a and miR-20b. Another 11 miRNAs (miR-206, miR-34a, miR-374, miR-424, miR-100, miR-101, miR-323, miR-368, miR-137, miR-138 and miR-377) were abundantly expressed in transdifferentiated neuronal progenitors. Among these miRNAs, miR-34a and miR-206 were the only two that have been linked to BM-MSC neurogenesis. Overexpressing miR-34a in cells suppressed the expression of 136 neuronal progenitor genes, all of which possess putative miR-34a binding sites. Gene enrichment analysis according to the Gene Ontology database showed that those 136 genes were associated with cell motility, energy production (including oxidative phosphorylation, electron transport and ATP synthesis) and actin cytoskeleton organization, indicating that miR-34a plays a critical role in precursor cell migration. Knocking down endogenous miR-34a expression in WJ-MSCs increased WJ-MSC motility.
Conclusions
Our data suggest a critical role for miRNAs in MSC neuronal differentiation and indicate that miR-34a contributes to neuronal precursor motility, which may be crucial for stem cells to home to their target sites.
doi:10.1186/1755-8794-4-65
PMCID: PMC3195087  PMID: 21923954
23.  Identification of candidate genes linking systemic inflammation to atherosclerosis; results of a human in vivo LPS infusion study 
BMC Medical Genomics  2011;4:64.
Background
It is widely accepted that atherosclerosis and inflammation are intimately linked. Monocytes play a key role in both of these processes and we hypothesized that activation of inflammatory pathways in monocytes would lead to, among others, proatherogenic changes in the monocyte transcriptome. Such differentially expressed genes in circulating monocytes would be strong candidates for further investigation in disease association studies.
Methods
Endotoxin, lipopolysaccharide (LPS), or saline control was infused in healthy volunteers. Monocyte RNA was isolated, processed and hybridized to Hver 2.1.1 spotted cDNA microarrays. Differential expression of key genes was confirmed by RT-PCR and results were compared to in vitro data obtained by our group to identify candidate genes.
Results
All subjects who received LPS experienced the anticipated clinical response indicating successful stimulation. One hour after LPS infusion, 11 genes were identified as being differentially expressed; 1 down regulated and 10 up regulated. Four hours after LPS infusion, 28 genes were identified as being differentially expressed; 3 being down regulated and 25 up regulated. No genes were significantly differentially expressed following saline infusion. Comparison with results obtained in in vitro experiments led to the identification of 6 strong candidate genes (BATF, BID, C3aR1, IL1RN, SEC61B and SLC43A3).
Conclusion
In vivo endotoxin exposure of healthy individuals resulted in the identification of several candidate genes through which systemic inflammation links to atherosclerosis.
doi:10.1186/1755-8794-4-64
PMCID: PMC3174875  PMID: 21827714
Human; Monocytes; LPS infusion; Transcriptome; In Vivo
24.  Combinations of newly confirmed Glioma-Associated loci link regions on chromosomes 1 and 9 to increased disease risk 
BMC Medical Genomics  2011;4:63.
Background
Glioblastoma multiforme (GBM) tends to occur between the ages of 45 and 70. This relatively early onset and its poor prognosis make the impact of GBM on public health far greater than would be suggested by its relatively low frequency. Tissue and blood samples have now been collected for a number of populations, and predisposing alleles have been sought by several different genome-wide association (GWA) studies. The Cancer Genome Atlas (TCGA) at NIH has also collected a considerable amount of data. Because of the low concordance between the results obtained using different populations, only 14 predisposing single nucleotide polymorphism (SNP) candidates in five genomic regions have been replicated in two or more studies. The purpose of this paper is to present an improved approach to biomarker identification.
Methods
Association analysis was performed with control of population stratifications using the EIGENSTRAT package, under the null hypothesis of "no association between GBM and control SNP genotypes," based on an additive inheritance model. Genes that are strongly correlated with identified SNPs were determined by linkage disequilibrium (LD) or expression quantitative trait locus (eQTL) analysis. A new approach that combines meta-analysis and pathway enrichment analysis identified additional genes.
Results
(i) A meta-analysis of SNP data from TCGA and the Adult Glioma Study identifies 12 predisposing SNP candidates, seven of which are reported for the first time. These SNPs fall in five genomic regions (5p15.33, 9p21.3, 1p21.2, 3q26.2 and 7p15.3), three of which have not been previously reported. (ii) 25 genes are strongly correlated with these 12 SNPs, eight of which are known to be cancer-associated. (iii) The relative risk for GBM is highest for risk allele combinations on chromosomes 1 and 9. (iv) A combined meta-analysis/pathway analysis identified an additional four genes. All of these have been identified as cancer-related, but have not been previously associated with glioma. (v) Some SNPs that do not occur reproducibly across populations are in reproducible (invariant) pathways, suggesting that they affect the same biological process, and that population discordance can be partially resolved by evaluating processes rather than genes.
Conclusion
We have uncovered 29 glioma-associated gene candidates; 12 of them known to be cancer related (p = 1.4 × 10^-6), providing additional statistical support for the relevance of the new candidates. This additional information on risk loci is potentially important for identifying Caucasian individuals at risk for glioma, and for assessing relative risk.
doi:10.1186/1755-8794-4-63
PMCID: PMC3212919  PMID: 21827660
25.  Integrative network analysis identifies key genes and pathways in the progression of hepatitis C virus induced hepatocellular carcinoma 
BMC Medical Genomics  2011;4:62.
Background
Incidence of hepatitis C virus (HCV) induced hepatocellular carcinoma (HCC) has been increasing in the United States and Europe during recent years. Although HCV-associated HCC shares many pathological characteristics with other types of HCC, its molecular mechanisms of progression remain elusive.
Methods
To investigate the underlying pathology, we developed a systematic approach to identify deregulated biological networks in HCC by integrating gene expression profiles with high-throughput protein-protein interaction data. We examined five stages including normal (control) liver, cirrhotic liver, dysplasia, early HCC and advanced HCC.
Results
Among the five consecutive pathological stages, we identified four networks including precancerous networks (Normal-Cirrhosis and Cirrhosis-Dysplasia) and cancerous networks (Dysplasia-Early HCC, Early-Advanced HCC). We found little overlap between precancerous and cancerous networks, in contrast to a substantial overlap within precancerous or cancerous networks. We further found that the hub proteins interacted with HCV proteins, suggesting direct intervention in these networks by the virus. The functional annotation of each network demonstrates a high degree of consistency with current knowledge in HCC. By assembling these functions into a module map, we could depict the stepwise biological functions that are deregulated in HCV-induced hepatocarcinogenesis. Additionally, these networks enable us to identify important genes and pathways by developmental stage, such as LCK signalling pathways in cirrhosis, MMP genes and TIMP genes in dysplastic liver, and CDC2-mediated cell cycle signalling in early and advanced HCC. CDC2 (alternative symbol CDK1), a cell cycle regulatory gene, is particularly interesting due to its topological position in temporally deregulated networks.
Conclusions
Our study uncovers a temporal spectrum of functional deregulation and prioritizes key genes and pathways in the progression of HCV induced HCC. These findings present a wealth of information for further investigation.
doi:10.1186/1755-8794-4-62
PMCID: PMC3212927  PMID: 21824427
