Results 1-11 (11)

1.  A proteome-scale map of the human interactome network 
Cell  2014;159(5):1212-1226.
Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high quality interactome models will help “connect the dots” of the genomic revolution.
PMCID: PMC4266588  PMID: 25416956
2.  Analyse multiple disease subtypes and build associated gene networks using genome-wide expression profiles 
BMC Genomics  2015;16(Suppl 5):S3.
Despite the large increase of transcriptomic studies that look for gene signatures on diseases, there is still a need for integrative approaches that obtain separation of multiple pathological states providing robust selection of gene markers for each disease subtype and information about the possible links or relations between those genes.
We present a network-oriented and data-driven bioinformatic approach that searches for association of genes and diseases based on the analysis of genome-wide expression data derived from microarrays or RNA-Seq studies. The approach aims to (i) identify gene sets associated to different pathological states analysed together; (ii) identify a minimum subset within these genes that unequivocally differentiates and classifies the compared disease subtypes; (iii) provide a measurement of the discriminant power of these genes and (iv) identify links between the genes that characterise each of the disease subtypes. This bioinformatic approach is implemented in an R package, named geNetClassifier, available as an open access tool in Bioconductor. To illustrate the performance of the tool, we applied it to two independent datasets: 250 samples from patients with four major leukemia subtypes analysed using expression arrays; another leukemia dataset analysed with RNA-Seq that includes a subtype also present in the previous set. The results show the selection of key deregulated genes recently reported in the literature and assigned to the leukemia subtypes studied. We also show, using these independent datasets, the selection of similar genes in a network built for the same disease subtype.
The construction of gene networks related to specific disease subtypes that include parameters such as gene-to-gene association, gene disease specificity and gene discriminant power can be very useful to draw gene-disease maps and to unravel the molecular features that characterize specific pathological states. The application of the bioinformatic tool here presented shows a neat way to achieve such molecular characterization of the diseases using genome-wide expression data.
PMCID: PMC4460584  PMID: 26040557
gene; expression; expression profile; gene networks; microarray; RNA-Seq; disease; disease classification; cancer; leukemia; acute leukemia
3.  Deregulation of Genes Related to Iron and Mitochondrial Metabolism in Refractory Anemia with Ring Sideroblasts 
PLoS ONE  2015;10(5):e0126555.
The presence of SF3B1 gene mutations is a hallmark of refractory anemia with ring sideroblasts (RARS). However, the mechanisms responsible for the iron accumulation that characterizes the Myelodysplastic Syndrome with ring sideroblasts (MDS-RS) are not completely understood. In order to gain insight into the molecular basis of MDS-RS, an integrative study of the expression and mutational status of genes related to iron and mitochondrial metabolism was carried out. A total of 231 low-risk MDS patients and 81 controls were studied. Gene expression analysis revealed that iron metabolism and mitochondrial function had the highest number of genes deregulated in RARS patients compared to controls and the refractory cytopenias with unilineage dysplasia (RCUD). Thus, the mitochondrial transporters SLC25 (SLC25A37 and SLC25A38) and ALAD genes were over-expressed in RARS. Moreover, significant differences were observed between patients with SF3B1 mutations and patients without the mutations. The deregulation of genes involved in iron and mitochondrial metabolism provides new insights into MDS-RS. New variants that could be involved in the pathogenesis of these diseases have been identified.
PMCID: PMC4425562  PMID: 25955609
4.  Functional Gene Networks: R/Bioc package to generate and analyse gene networks derived from functional enrichment and clustering 
Bioinformatics  2015;31(10):1686-1688.
Summary: Functional Gene Networks (FGNet) is an R/Bioconductor package that generates gene networks derived from the results of functional enrichment analysis (FEA) and annotation clustering. The sets of genes enriched with specific biological terms (obtained from a FEA platform) are transformed into a network by establishing links between genes based on common functional annotations and common clusters. The network provides a new view of FEA results revealing gene modules with similar functions and genes that are related to multiple functions. In addition to building the functional network, FGNet analyses the similarity between the groups of genes and provides a distance heatmap and a bipartite network of functionally overlapping genes. The application includes an interface to directly perform FEA queries using different external tools: DAVID, GeneTerm Linker, TopGO or GAGE; and a graphical interface to facilitate the use.
Availability and implementation: FGNet is available in Bioconductor, including a tutorial. URL:
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4426835  PMID: 25600944
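The network construction summarized in entry 4 (linking genes that share functional annotations) can be illustrated with a minimal Python sketch. This is not FGNet's code or API: the function name `build_functional_network`, the input layout (a mapping from term to gene list) and the toy gene and term names are all assumptions made for illustration, and the sketch covers only links from shared terms, not annotation clusters.

```python
from collections import defaultdict
from itertools import combinations


def build_functional_network(term_to_genes):
    """Link every pair of genes annotated with the same enriched term.

    Returns {(gene_a, gene_b): number_of_shared_terms}, with each pair
    stored in alphabetical order. Hypothetical helper, not part of FGNet.
    """
    edges = defaultdict(int)
    for genes in term_to_genes.values():
        # Deduplicate and sort so each pair has one canonical orientation.
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            edges[(gene_a, gene_b)] += 1
    return dict(edges)


# Toy enrichment result (made-up terms and genes).
enriched = {
    "term_A": ["G1", "G2", "G3"],
    "term_B": ["G2", "G3"],
    "term_C": ["G4"],
}
network = build_functional_network(enriched)
```

In this toy input, G2 and G3 share two terms and so receive the heaviest edge, while G4 has no partner and stays out of the network.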
5.  Combined analysis of genome-wide expression and copy number profiles to identify key altered genomic regions in cancer 
BMC Genomics  2012;13(Suppl 5):S5.
Analysis of DNA copy number alterations and gene expression changes in human samples has been used to find potential target genes in complex diseases. Recent studies have combined these two types of data using different strategies, but they have focused on finding gene-based relationships. However, it has been proposed that these data can be used to identify key genomic regions, which may enclose causal genes under the assumption that disease-associated gene expression changes are caused by genomic alterations.
Following this proposal, we undertake a new integrative analysis of genome-wide expression and copy number datasets. The analysis is based on the combined location of both types of signals along the genome. Our approach takes into account the genomic location in the copy number (CN) analysis and also in the gene expression (GE) analysis. To achieve this we apply a segmentation algorithm to both types of data using paired samples. Then, we perform a correlation analysis and a frequency analysis of the gene loci in the segmented CN regions and the segmented GE regions; selecting in both cases the statistically significant loci. In this way, we find CN alterations that show strong correspondence with GE changes. We applied our method to a human dataset of 64 Glioblastoma Multiforme samples finding key loci and hotspots that correspond to major alterations previously described for this type of tumors.
Identification of key altered genomic loci constitutes a first step to find the genes that drive the alteration in a malignant state. These driver genes can be found in regions that show high correlation in copy number alterations and expression changes.
PMCID: PMC3476997  PMID: 23095915
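The correlation step described in entry 5 can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes the copy number (CN) and gene expression (GE) values have already been segmented and assigned to loci, it uses a plain Pearson coefficient with an arbitrary cutoff of 0.8 in place of a significance test, and the names `pearson`, `correlated_loci` and the toy loci are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_loci(cn, ge, threshold=0.8):
    """Return loci whose CN and GE values correlate across paired samples.

    cn and ge map locus -> per-sample values, with samples in the same order.
    The threshold is an assumed cutoff, not a value from the paper.
    """
    return [
        locus
        for locus in cn
        if locus in ge and pearson(cn[locus], ge[locus]) >= threshold
    ]


# Toy data for four paired samples (made-up values).
cn = {"locus_1": [2, 3, 4, 5], "locus_2": [2, 2, 3, 1]}
ge = {"locus_1": [1.0, 2.1, 2.9, 4.2], "locus_2": [5, 1, 2, 4]}
selected = correlated_loci(cn, ge)
```

Only `locus_1`, where expression rises with copy number, passes the cutoff in this toy data.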
6.  Prognostic Impact of del(17p) and del(22q) as Assessed by Interphase FISH in Sporadic Colorectal Carcinomas 
PLoS ONE  2012;7(8):e42683.
Most sporadic colorectal cancer (sCRC) deaths are caused by metastatic dissemination of the primary tumor. New advances in genetic profiling of sCRC suggest that the primary tumor may contain a cell population with metastatic potential. Here we compare the cytogenetic profile of primary tumors from liver metastatic versus non-metastatic sCRC.
Methodology/Principal Findings
We prospectively analyzed the frequency of numerical/structural abnormalities of chromosomes 1, 7, 8, 13, 14, 17, 18, 20, and 22 by iFISH in 58 sCRC patients: 31 with non-metastatic (54%) vs. 27 with metastatic (46%) disease. From a total of 18 probes, significant differences emerged only for the 17p11.2 and 22q11.2 chromosomal regions. Patients with liver metastatic sCRC showed an increased frequency of del(17p11.2) (10% vs. 67%; p<.001) and del(22q11.2) (0% vs. 22%; p = .02) versus non-metastatic cases. Multivariate analysis of prognostic factors for overall survival (OS) showed that the only clinical and cytogenetic parameters that had an independent adverse impact on patient outcome were the presence of del(17p) with a 17p11.2 breakpoint and del(22q11.2). Based on these two cytogenetic variables, patients were classified into three groups: low-risk (no adverse features), intermediate-risk (one adverse feature) and high-risk (two adverse features), with significantly different OS rates at 5 years (p<.001): 92%, 53% and 0%, respectively.
Our results unravel the potential implication of del(17p11.2) in sCRC patients with liver metastasis as this cytogenetic alteration appears to be intrinsically related to an increased metastatic potential and a poor outcome, providing additional prognostic information to that associated with other cytogenetic alterations such as del(22q11.2). Additional prospective studies in larger series of patients would be required to confirm the clinical utility of the new prognostic markers identified.
PMCID: PMC3422354  PMID: 22912721
7.  Functional Analysis beyond Enrichment: Non-Redundant Reciprocal Linkage of Genes and Biological Terms 
PLoS ONE  2011;6(9):e24289.
Functional analysis of large sets of genes and proteins is becoming more and more necessary with the increase of experimental biomolecular data at omic-scale. Enrichment analysis is by far the most popular available methodology to derive functional implications of sets of cooperating genes. The problem with these techniques lies in the redundancy of the resulting information, which in most cases generates many trivial results that risk masking key biological events. We present and describe a computational method, called GeneTerm Linker, that filters and links enriched output data, identifying sets of associated genes and terms and producing metagroups of coherent biological significance. The method uses fuzzy reciprocal linkage between genes and terms to unravel their functional convergence and associations. The algorithm is tested with a small set of well-known interacting proteins from yeast and with a large collection of reference sets from three heterogeneous resources: multiprotein complexes (CORUM), cellular pathways (SGD) and human diseases (OMIM). Statistical Precision, Recall and balanced F-score are calculated, showing robust results even when different levels of random noise are included in the test sets. Although we could not find an equivalent method, we present a comparative analysis with a widely used method that combines enrichment and functional annotation clustering. A web application to use the method here proposed is provided at
PMCID: PMC3174934  PMID: 21949701
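The evaluation measures named in entry 7 (Precision, Recall and balanced F-score) follow their standard definitions and can be computed for a recovered gene set against a reference set as in the sketch below. The function name `precision_recall_f1` and the toy gene sets are assumptions for illustration; the sketch says nothing about how GeneTerm Linker itself produces its metagroups.

```python
def precision_recall_f1(predicted, reference):
    """Precision, recall and balanced F-score (F1) of a predicted set.

    precision = TP / |predicted|, recall = TP / |reference|,
    F1 = harmonic mean of precision and recall. Empty inputs yield 0.0.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 2 of 4 predicted genes are in a 3-gene reference set.
precision, recall, f1 = precision_recall_f1(
    ["A", "B", "C", "D"], ["A", "B", "E"]
)
```

Here precision is 2/4, recall is 2/3, and the balanced F-score is their harmonic mean, 4/7.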
8.  Mapping of Genetic Abnormalities of Primary Tumours from Metastatic CRC by High-Resolution SNP Arrays 
PLoS ONE  2010;5(10):e13752.
For years, the genetics of metastatic colorectal cancer (CRC) have been studied using a variety of techniques. However, most of the approaches employed so far have a relatively limited resolution which hampers detailed characterization of the common recurrent chromosomal breakpoints as well as the identification of small regions carrying genetic changes and the genes involved in them.
Methodology/Principal Findings
Here we applied 500K SNP arrays to map the most common chromosomal lesions present at diagnosis in a series of 23 primary tumours from sporadic CRC patients who had developed liver metastasis. Overall our results confirm that the genetic profile of metastatic CRC is defined by imbalanced gains of chromosomes 7, 8q, 11q, 13q, 20q and X together with losses of the 1p, 8p, 17p and 18q chromosome regions. In addition, SNP-array studies allowed the identification of small (<1.3 Mb) and extensive/large (>1.5 Mb) altered DNA sequences, many of which contain cancer genes known to be involved in CRC and the metastatic process. Detailed characterization of the breakpoint regions for the altered chromosomes showed four recurrent breakpoints at chromosomes 1p12, 8p12, 17p11.2 and 20p12.1; interestingly, the most frequently observed recurrent chromosomal breakpoint was localized at 17p11.2 and systematically targeted the FAM27L gene, whose role in CRC deserves further investigations.
In summary, in the present study we provide a detailed map of the genetic abnormalities of primary tumours from metastatic CRC patients, which confirms and extends previous observations regarding the identification of genes potentially involved in the development of CRC and the metastatic process.
PMCID: PMC2966422  PMID: 21060790
10.  GATExplorer: Genomic and Transcriptomic Explorer; mapping expression probes to gene loci, transcripts, exons and ncRNAs 
BMC Bioinformatics  2010;11:221.
Genome-wide expression studies have developed exponentially in recent years as a result of extensive use of microarray technology. However, expression signals are typically calculated using the assignment of "probesets" to genes, without addressing the problem of "gene" definition or proper consideration of the location of the measuring probes in the context of the currently known genomes and transcriptomes. Moreover, as our knowledge of metazoan genomes improves, the number of both protein-coding and noncoding genes, as well as their associated isoforms, continues to increase. Consequently, there is a need for new databases that combine genomic and transcriptomic information and provide updated mapping of expression probes to current genomic annotations.
GATExplorer (Genomic and Transcriptomic Explorer) is a database and web platform that integrates a gene loci browser with nucleotide level mappings of oligo probes from expression microarrays. It allows interactive exploration of gene loci, transcripts and exons of human, mouse and rat genomes, and shows the specific location of all mappable Affymetrix microarray probes and their respective expression levels in a broad set of biological samples. The web site allows visualization of probes in their genomic context together with any associated protein-coding or noncoding transcripts. In the case of all-exon arrays, this provides a means by which the expression of the individual exons within a gene can be compared, thereby facilitating the identification and analysis of alternatively spliced exons. The application integrates data from four major source databases: Ensembl, RNAdb, Affymetrix and GeneAtlas; and it provides the users with a series of files and packages (R CDFs) to analyze particular query expression datasets. The maps cover both the widely used Affymetrix GeneChip microarrays based on 3' expression (e.g. human HG U133 series) and the all-exon expression microarrays (Gene 1.0 and Exon 1.0).
GATExplorer is an integrated database that combines genomic/transcriptomic visualization with nucleotide-level probe mapping. By considering expression at the nucleotide level rather than the gene level, it shows that the arrays detect expression signals from entities that most researchers do not contemplate or discriminate. This approach provides the means to undertake a higher resolution analysis of microarray data and potentially extract considerably more detailed and biologically accurate information from existing and future microarray experiments.
PMCID: PMC2875241  PMID: 20429936
11.  Human Gene Coexpression Landscape: Confident Network Derived from Tissue Transcriptomic Profiles 
PLoS ONE  2008;3(12):e3911.
Analysis of gene expression data using genome-wide microarrays is a technique often used in genomic studies to find coexpression patterns and locate groups of co-transcribed genes. However, most studies done at a global “omic” scale do not focus on human samples, and those that do very often include heterogeneous datasets that mix normal with disease-altered samples. Moreover, the technical noise present in genome-wide expression microarrays is another well-reported problem that is often not addressed with robust statistical methods, and estimates of the error in the data are not provided.
Methodology/Principal Findings
Human genome-wide expression data from a controlled set of normal-healthy tissues is used to build a confident human gene coexpression network avoiding both pathological and technical noise. To achieve this we describe a new method that combines several statistical and computational strategies: robust normalization and expression signal calculation; correlation coefficients obtained by parametric and non-parametric methods; random cross-validations; and estimation of the statistical accuracy and coverage of the data. All these methods provide a series of coexpression datasets where the level of error is measured and can be tuned. To define the errors, the rates of true positives are calculated by assignment to biological pathways. The results provide a confident human gene coexpression network that includes 3327 gene-nodes and 15841 coexpression-links and a comparative analysis shows good improvement over previously published datasets. Further functional analysis of a subset core network, validated by two independent methods, shows coherent biological modules that share common transcription factors. The network reveals a map of coexpression clusters organized in well defined functional constellations. Two major regions in this network correspond to genes involved in nuclear and mitochondrial metabolism and investigations on their functional assignment indicate that more than 60% are house-keeping and essential genes. The network displays new non-described gene associations and it allows the placement in a functional context of some unknown non-assigned genes based on their interactions with known gene families.
The identification of stable and reliable human gene to gene coexpression networks is essential to unravel the interactions and functional correlations between human genes at an omic scale. This work contributes to this aim, and we are making available for the scientific community the validated human gene coexpression networks obtained, to allow further analyses on the network or on some specific gene associations.
The data are available free online at
PMCID: PMC2597745  PMID: 19081792
