Results 1-15 (15)

1.  RSAT 2015: Regulatory Sequence Analysis Tools 
Nucleic Acids Research  2015;43(Web Server issue):W50-W56.
RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), and a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at
PMCID: PMC4489296  PMID: 25904632
2.  Whole-Exome Sequencing and High Throughput Genotyping Identified KCNJ11 as the Thirteenth MODY Gene 
PLoS ONE  2012;7(6):e37423.
Maturity-onset diabetes of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY.
WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out.
Principal Findings
By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects.
Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene (‘MODY13’), confirming the wide spectrum of diabetes-related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11, as affected carriers can be ideally treated with oral sulfonylureas.
PMCID: PMC3372463  PMID: 22701567
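The variant filter described in this abstract (keep protein-altering variants carried by all three affected relatives, absent from the unaffected relative and not reported in dbSNP) can be read as a set operation. The sketch below is only an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, the consequence labels and the toy gene names are all hypothetical.

```python
from dataclasses import dataclass

# Consequence classes the abstract lists as "variants of interest".
# The label strings themselves are hypothetical.
INTERESTING = {"stop_gained", "frameshift", "non_synonymous", "splice_site"}


@dataclass(frozen=True)  # frozen makes instances hashable, so they fit in sets
class Variant:
    gene: str
    change: str
    consequence: str
    in_dbsnp: bool


def candidate_variants(affected, unaffected):
    """Return variants shared by every affected relative and absent from all
    unaffected relatives, restricted to novel variants of interest.

    affected, unaffected: lists of sets of Variant, one set per individual.
    """
    shared = set.intersection(*affected)
    seen_in_unaffected = set().union(*unaffected)
    return {
        v
        for v in shared - seen_in_unaffected
        if v.consequence in INTERESTING and not v.in_dbsnp
    }


# Toy data (invented for illustration only).
a = Variant("GENE_A", "c.100A>G", "non_synonymous", False)  # kept
b = Variant("GENE_B", "c.200C>T", "non_synonymous", True)   # already in dbSNP
c = Variant("GENE_C", "c.300G>A", "synonymous", False)      # not of interest
d = Variant("GENE_D", "c.400del", "frameshift", False)      # also in unaffected
e = Variant("GENE_E", "c.500T>C", "stop_gained", False)     # not in all affected

affected = [{a, b, c, d, e}, {a, b, c, d, e}, {a, b, c, d}]
unaffected = [{d}]

hits = candidate_variants(affected, unaffected)
print(sorted(v.gene for v in hits))  # ['GENE_A']
```

As the abstract notes, uneven coverage between samples meant that many more candidates had to be genotyped than this strict intersection would return; a practical filter would need to treat "not covered" differently from "absent".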
3.  Correction: Clusters of Conserved Beta Cell Marker Genes for Assessment of Beta Cell Phenotype 
PLoS ONE  2012;7(1):10.1371/annotation/a91571a6-acbb-456f-bcc5-f4a431e28516.
PMCID: PMC3267642
4.  Correction: Clusters of Conserved Beta Cell Marker Genes for Assessment of Beta Cell Phenotype 
PLoS ONE  2012;7(1):10.1371/annotation/4aae21a9-e176-4feb-9f15-103b265d3335.
PMCID: PMC3267643
5.  Correction: Clusters of Conserved Beta Cell Marker Genes for Assessment of Beta Cell Phenotype 
PLoS ONE  2012;7(1):10.1371/annotation/7aa0ff33-5660-4b56-889a-4b86a273d522.
PMCID: PMC3267644
6.  Correction: Clusters of Conserved Beta Cell Marker Genes for Assessment of Beta Cell Phenotype 
PLoS ONE  2012;7(1):10.1371/annotation/13c6d084-a8fd-4019-a3cf-12a0d8abe309.
PMCID: PMC3267645
7.  RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets 
Nucleic Acids Research  2011;40(4):e31.
ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.
PMCID: PMC3287167  PMID: 22156162
8.  Clusters of Conserved Beta Cell Marker Genes for Assessment of Beta Cell Phenotype 
PLoS ONE  2011;6(9):e24134.
Background and Methodology
The aim of this study was to establish a gene expression blueprint of pancreatic beta cells conserved from rodents to humans and to evaluate its applicability to assess shifts in the beta cell differentiated state. Genome-wide mRNA expression profiles of isolated beta cells were compared to those of a large panel of other tissue and cell types, and transcripts with beta cell-abundant and -selective expression were identified. Iteration of this analysis in mouse, rat and human tissues generated a panel of conserved beta cell biomarkers. This panel was then used to compare isolated versus laser capture microdissected beta cells, monitor adaptations of the beta cell phenotype to fasting, and retrieve possible conserved transcriptional regulators.
Principal Findings
A panel of 332 conserved beta cell biomarker genes was found to discriminate both isolated and laser capture microdissected beta cells from all other examined cell types. Of all conserved beta cell-markers, 15% were strongly beta cell-selective and functionally associated to hormone processing, 15% were shared with neuronal cells and associated to regulated synaptic vesicle transport and 30% with immune plus gut mucosal tissues reflecting active protein synthesis. Fasting specifically down-regulated the latter cluster, but preserved the neuronal and strongly beta cell-selective traits, indicating preserved differentiated state. Analysis of consensus binding site enrichment indicated major roles of CREB/ATF and various nutrient- or redox-regulated transcription factors in maintenance of differentiated beta cell phenotype.
Conserved beta cell marker genes contain major gene clusters defined by their beta cell selectivity or by their additional abundance in either neural cells or in immune plus gut mucosal cells. This panel can be used as a template to identify changes in the differentiated state of beta cells.
PMCID: PMC3166300  PMID: 21912665
9.  RSAT 2011: regulatory sequence analysis tools 
Nucleic Acids Research  2011;39(Web Server issue):W86-W91.
RSAT (Regulatory Sequence Analysis Tools) comprises a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. Thirteen new programs have been added to the 30 described in the 2008 NAR Web Software Issue, including an automated sequence retrieval from EnsEMBL (retrieve-ensembl-seq), two novel motif discovery algorithms (oligo-diff and info-gibbs), a 100-times faster version of matrix-scan enabling the scanning of genome-scale sequence sets, and a series of facilities for random model generation and statistical evaluation (random-genome-fragments, random-motifs, random-sites, implant-sites, sequence-probability, permute-matrix). Our most recent work also focused on motif comparison (compare-matrices) and evaluation of motif quality (matrix-quality) by combining theoretical and empirical measures to assess the predictive capability of position-specific scoring matrices. To process large collections of peak sequences obtained from ChIP-seq or related technologies, RSAT provides a new program (peak-motifs) that combines several efficient motif discovery algorithms to predict transcription factor binding motifs, match them against motif databases and predict their binding sites. Availability (web site, stand-alone programs and SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services):
PMCID: PMC3125777  PMID: 21715389
10.  Molecular Diagnosis of Neonatal Diabetes Mellitus Using Next-Generation Sequencing of the Whole Exome 
PLoS ONE  2010;5(10):e13630.
Accurate molecular diagnosis of monogenic non-autoimmune neonatal diabetes mellitus (NDM) is critical for patient care, as patients carrying a mutation in KCNJ11 or ABCC8 can be treated by oral sulfonylurea drugs instead of insulin therapy. This diagnosis is currently based on Sanger sequencing of at least 42 PCR fragments from the KCNJ11, ABCC8, and INS genes. Here, we assessed the feasibility of using the next-generation whole exome sequencing (WES) for the NDM molecular diagnosis.
Methodology/Principal Findings
We carried out WES for a patient presenting with permanent NDM, for whom mutations in KCNJ11, ABCC8 and INS and abnormalities in chromosome 6q24 had been previously excluded. A solution hybridization selection was performed to generate WES in 76 bp paired-end reads, by using two channels of the sequencing instrument. WES quality was assessed using a high-resolution oligonucleotide whole-genome genotyping array. From our WES with high-quality reads, we identified a novel non-synonymous mutation in ABCC8 (c.1455G>C/p.Q485H), despite a previous negative sequencing of this gene. This mutation, confirmed by Sanger sequencing, was not present in 348 controls and in the patient's mother, father and young brother, all of whom are normoglycemic.
WES identified a novel de novo ABCC8 mutation in an NDM patient. Compared to the current Sanger protocol, WES is a comprehensive, cost-efficient and rapid method to identify mutations in NDM patients. We suggest WES as a near-future tool of choice for further molecular diagnosis of NDM cases that are negative for chr6q24, KCNJ11 and INS abnormalities.
PMCID: PMC2964316  PMID: 21049026
11.  The EMBRACE web service collection 
Nucleic Acids Research  2010;38(Web Server issue):W683-W688.
The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.
PMCID: PMC2896104  PMID: 20462862
12.  Integrating sequence, evolution and functional genomics in regulatory genomics 
Genome Biology  2009;10(1):202.
Finding transcription factor binding sites in regulatory regions of the genome
With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome.
PMCID: PMC2687781  PMID: 19226437
13.  NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways 
Nucleic Acids Research  2008;36(Web Server issue):W444-W451.
The network analysis tools (NeAT) provide user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, clique discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). The NeAT tools are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, making it possible to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources.
PMCID: PMC2447721  PMID: 18524799
14.  RSAT: regulatory sequence analysis tools 
Nucleic Acids Research  2008;36(Web Server issue):W119-W127.
The regulatory sequence analysis tools (RSAT) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
PMCID: PMC2447775  PMID: 18495751
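To make the idea of matrix-based scanning against a background model concrete, here is a minimal sketch of log-odds scoring with a Bernoulli (position-independent) background. It is a generic textbook formulation, not RSAT's implementation: the function names, the pseudocount scheme and the toy count matrix are assumptions made for illustration, and Markov backgrounds and P-value estimation are not covered.

```python
import math


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Convert per-position base counts into log-odds weights.

    counts: list of dicts {base: count}, one dict per motif position.
    background: dict {base: probability} (Bernoulli model, sums to 1).
    The pseudocount is distributed according to the background frequencies.
    """
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        weights = {}
        for base, bg in background.items():
            freq = (column.get(base, 0) + pseudocount * bg) / total
            weights[base] = math.log(freq / bg)
        matrix.append(weights)
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; return (start, window, score).

    Assumes the sequence contains only bases present in the background
    model (e.g. upper-case A, C, G, T); other symbols raise KeyError.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


# Toy example: an invented 4-column count matrix favouring "TATA".
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
matrix = log_odds_matrix(counts, background)

hits = scan("GGTATAGG", matrix)
best = max(hits, key=lambda hit: hit[2])
print(best[0], best[1])  # 2 TATA
```

A positive score means the window is more likely under the motif model than under the background; choosing a threshold is where P-value estimation, as mentioned in the abstract, comes in.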
15.  Fine-Tuning Enhancer Models to Predict Transcriptional Targets across Multiple Genomes 
PLoS ONE  2007;2(11):e1115.
Networks of regulatory relations between transcription factors (TF) and their target genes (TG), implemented through TF binding sites (TFBS), are key features of biology. An idealized approach to solving such networks consists of starting from a consensus TFBS or a position weight matrix (PWM) to generate a high-accuracy list of candidate TGs for biological validation. Developing and evaluating such approaches remains a formidable challenge in regulatory bioinformatics. We perform a benchmark study on 34 Drosophila TFs to assess existing TFBS and cis-regulatory module (CRM) detection methods, with a strong focus on the use of multiple genomes. In particular, for CRM modelling we investigate the addition of orthologous sites to a known PWM to construct phyloPWMs and we assess the added value of phylogenetic footprinting to predict contextual motifs around known TFBSs. For CRM prediction, we compare motif conservation with network-level conservation approaches across multiple genomes. Choosing the optimal training and scoring strategies strongly enhances the performance of TG prediction for more than half of the tested TFs. Finally, we analyse a 35th TF, namely Eyeless, and find a significant overlap between predicted TGs and candidate TGs identified by microarray expression studies. In summary, we identify several ways to optimize TF-specific TG predictions, some of which can be applied to all TFs, and others that can be applied only to particular TFs. The ability to model known TF-TG relations, together with the use of multiple genomes, results in a significant step forward in solving the architecture of gene regulatory networks.
PMCID: PMC2047340  PMID: 17973026
