author:("stupa, Elia")
1.  ARNT2 mutation causes hypopituitarism, post-natal microcephaly, visual and renal anomalies 
Brain  2013;136(10):3096-3105.
We describe a previously unreported syndrome characterized by secondary (post-natal) microcephaly with fronto-temporal lobe hypoplasia, multiple pituitary hormone deficiency, seizures, severe visual impairment and abnormalities of the kidneys and urinary tract in a highly consanguineous family with six affected children. Homozygosity mapping and exome sequencing revealed a novel homozygous frameshift mutation in the basic helix-loop-helix transcription factor gene ARNT2 (c.1373_1374dupTC) in affected individuals. This mutation results in absence of detectable levels of ARNT2 transcript and protein from patient fibroblasts compared with controls, consistent with nonsense-mediated decay of the mutant transcript and loss of ARNT2 function. We also show expression of ARNT2 within the central nervous system, including the hypothalamus, as well as the renal tract during human embryonic development. The progressive neurological abnormalities, congenital hypopituitarism and post-retinal visual pathway dysfunction in affected individuals demonstrate for the first time the essential role of ARNT2 in the development of the hypothalamo-pituitary axis, post-natal brain growth, and visual and renal function in humans.
PMCID: PMC3784281  PMID: 24022475
hypothalamus; congenital blindness; brain development; molecular genetics; malformations of cortical development
2.  VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites 
Genome Medicine  2014;6(9):67.
The analysis of the genomic distribution of viral vector genomic integration sites is a key step in hematopoietic stem cell-based gene therapy applications, making it possible to assess both the safety and the efficacy of the treatment and to study the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings.
Electronic supplementary material
The online version of this article (doi:10.1186/s13073-014-0067-5) contains supplementary material, which is available to authorized users.
PMCID: PMC4169225  PMID: 25342980
3.  Integrated genomic analysis identifies recurrent mutations and evolution patterns driving the initiation and progression of follicular lymphoma 
Nature genetics  2013;46(2):176-181.
Follicular lymphoma (FL) is an incurable malignancy [1], with transformation to an aggressive subtype being a critical event during disease progression. Here we performed whole genome or exome sequencing on 10 FL-transformed FL pairs, followed by deep sequencing of 28 genes in an extension cohort and report the key events and evolutionary processes governing initiation and transformation. Tumor evolution occurred through either a ‘rich’ or ‘sparse’ ancestral common progenitor clone (CPC). We identified recurrent mutations in linker histones, JAK-STAT signaling, NF-κB signaling and B-cell development genes. Longitudinal analyses revealed chromatin regulators (CREBBP, EZH2 and MLL2) as early driver genes, whilst mutations in EBF1 and regulators of NF-κB signaling (MYD88 and TNFAIP3) were gained at transformation. Collectively, this study provides novel insights into the genetic basis of follicular lymphoma, the clonal dynamics of transformation and suggests that personalizing therapies to target key genetic alterations within the CPC represents an attractive therapeutic strategy.
PMCID: PMC3907271  PMID: 24362818
4.  The combination of transcriptomics and informatics identifies pathways targeted by miR-204 during neurogenesis and axon guidance 
Nucleic Acids Research  2014;42(12):7793-7806.
Vertebrate organogenesis is critically sensitive to gene dosage and even subtle variations in the expression levels of key genes may result in a variety of tissue anomalies. MicroRNAs (miRNAs) are fundamental regulators of gene expression and their role in vertebrate tissue patterning is just beginning to be elucidated. To gain further insight into this issue, we analysed the transcriptomic consequences of manipulating the expression of miR-204 in the Medaka fish model system. We used RNA-Seq and an innovative bioinformatics approach, which combines conventional differential expression analysis with the behavior expected by miR-204 targets after its overexpression and knockdown. With this approach combined with a correlative analysis of the putative targets, we identified a wider set of miR-204 target genes belonging to different pathways. Together, these approaches confirmed that miR-204 has a key role in eye development and further highlighted its putative function in neural differentiation processes, including axon guidance as supported by in vivo functional studies. Together, our results demonstrate the advantage of integrating next-generation sequencing and bioinformatics approaches to investigate miRNA biology and provide new important information on the role of miRNAs in the control of axon guidance and more broadly in nervous system development.
PMCID: PMC4081098  PMID: 24895435
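The filtering idea in the abstract above (a target should respond in opposite directions to overexpression and knockdown of the miRNA) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the gene labels, the log2 fold-change values and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
def candidate_targets(overexpression_lfc, knockdown_lfc, threshold=0.5):
    """Return genes whose expression falls when the miRNA is overexpressed
    and rises when it is knocked down (hypothetical selection rule)."""
    candidates = []
    for gene, over in overexpression_lfc.items():
        knock = knockdown_lfc.get(gene)
        if knock is None:
            continue  # gene not measured in both experiments
        # A repressed target is expected to go down on overexpression
        # and up on knockdown of the miRNA.
        if over <= -threshold and knock >= threshold:
            candidates.append(gene)
    return sorted(candidates)


# Made-up log2 fold changes, for illustration only.
overexpression_lfc = {"geneA": -1.2, "geneB": 0.8, "geneC": -0.9, "geneD": -0.1}
knockdown_lfc = {"geneA": 0.9, "geneB": -0.7, "geneC": -0.3, "geneD": 0.2}

print(candidate_targets(overexpression_lfc, knockdown_lfc))  # ['geneA']
```

Requiring both conditions is stricter than a single differential expression test, which is the apparent motivation for combining the two experiments.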
5.  Genome-wide signatures of convergent evolution in echolocating mammals 
Nature  2013;502(7470):10.1038/nature12511.
Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes [1-3]. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures [4,5]. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level [6-9]. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution [9,10], although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised.
PMCID: PMC3836225  PMID: 24005325
6.  Genome Wide Identification of Aberrant Alternative Splicing Events in Myotonic Dystrophy Type 2 
PLoS ONE  2014;9(4):e93983.
Myotonic dystrophy type 2 (DM2) is a genetic, autosomal dominant disease due to expansion of tetraplet (CCTG) repetitions in the first intron of the ZNF9/CNBP gene. DM2 is a multisystemic disorder affecting the skeletal muscle, the heart, the eye and the endocrine system. According to the proposed pathological mechanism, the expanded tetraplets have an RNA toxic effect, disrupting the splicing of many mRNAs. Thus, the identification of aberrantly spliced transcripts is instrumental for our understanding of the molecular mechanisms underpinning the disease. The aim of this study was the identification of new aberrant alternative splicing events in DM2 patients. By genome wide analysis of 10 DM2 patients and 10 controls (CTR), we identified 273 alternatively spliced exons in 218 genes. While many aberrant splicing events were already identified in the past, most were new. A subset of these events was validated by qPCR assays in 19 DM2 and 15 CTR subjects. To gain insight into the molecular pathways involving the identified aberrantly spliced genes, we performed a bioinformatics analysis with the Ingenuity system. This analysis indicated a deregulation of development, cell survival, metabolism, calcium signaling and contractility. In conclusion, our genome wide analysis provided a database of aberrant splicing events in the skeletal muscle of DM2 patients. The affected genes are involved in numerous pathways and networks important for muscle physio-pathology, suggesting that the identified variants may contribute to DM2 pathogenesis.
PMCID: PMC3983107  PMID: 24722564
7.  Dissecting the signaling pathways associated with the oncogenic activity of MLK3 P252H mutation 
BMC Cancer  2014;14:182.
MLK3 gene mutations were described to occur in about 20% of microsatellite unstable gastrointestinal cancers and to harbor oncogenic activity. In particular, mutation P252H, located in the kinase domain, was found to have a strong transforming potential, and to promote the growth of highly invasive tumors when subcutaneously injected in nude mice. Nevertheless, the molecular mechanism underlying the oncogenic activity of P252H mutant remained elusive.
In this work, we performed Illumina Whole Genome arrays on three biological replicas of human HEK293 cells stably transfected with the wild-type MLK3, the P252H mutation and with the empty vector (Mock) in order to identify the putative signaling pathways associated with P252H mutation.
Our microarray results showed that mutant MLK3 deregulates several important colorectal cancer-associated signaling pathways such as WNT, MAPK, NOTCH, TGF-beta and p53, helping to narrow down the number of potential MLK3 targets responsible for its oncogenic effects. A more detailed analysis of the alterations affecting the WNT signaling pathway revealed a down-regulation of molecules involved in the canonical pathway, such as DVL2, LEF1, CCND1 and c-Myc, and an up-regulation of DKK, a well-known negative regulator of canonical WNT signaling, in MLK3 mutant cells. Additionally, FZD6 and FZD10 genes, known to act as negative regulators of the canonical WNT signaling cascade and as positive regulators of the planar cell polarity (PCP) pathway, a non-canonical WNT pathway, were found to be up-regulated in P252H cells.
The results provide an overall view of the expression profile associated with mutant MLK3, and they support the functional role of mutant MLK3 by showing a deregulation of several signaling pathways known to play important roles in the development and progression of colorectal cancer. The results also suggest that mutant MLK3 may be a novel modulator of WNT signaling, and pinpoint the activation of PCP pathway as a possible mechanism underlying the invasive potential of MLK3 mutant cells.
PMCID: PMC3995575  PMID: 24628919
Colorectal cancer; MLK3; WNT pathway; MSI; Planar cell polarity
8.  Mutation of SALL2 causes recessive ocular coloboma in humans and mice 
Human Molecular Genetics  2014;23(10):2511-2526.
Ocular coloboma is a congenital defect resulting from failure of normal closure of the optic fissure during embryonic eye development. This birth defect causes childhood blindness worldwide, yet the genetic etiology is poorly understood. Here, we identified a novel homozygous mutation in the SALL2 gene in members of a consanguineous family affected with non-syndromic ocular coloboma variably affecting the iris and retina. This mutation, c.85G>T, introduces a premature termination codon (p.Glu29*) predicted to truncate the SALL2 protein so that it lacks three clusters of zinc-finger motifs that are essential for DNA-binding activity. This discovery identifies SALL2 as the third member of the Drosophila homeotic Spalt-like family of developmental transcription factor genes implicated in human disease. SALL2 is expressed in the developing human retina at the time of, and subsequent to, optic fissure closure. Analysis of Sall2-deficient mouse embryos revealed delayed apposition of the optic fissure margins and the persistence of an anterior retinal coloboma phenotype after birth. Sall2-deficient embryos displayed correct posterior closure toward the optic nerve head, and upon contact of the fissure margins, dissolution of the basal lamina occurred and PAX2, known to be critical for this process, was expressed normally. Anterior closure was disrupted with the fissure margins failing to meet, or in some cases misaligning leading to a retinal lesion. These observations demonstrate, for the first time, a role for SALL2 in eye morphogenesis and that loss of function of the gene causes ocular coloboma in humans and mice.
PMCID: PMC3990155  PMID: 24412933
9.  Genome-Wide Methylation and Gene Expression Changes in Newborn Rats following Maternal Protein Restriction and Reversal by Folic Acid 
PLoS ONE  2013;8(12):e82989.
A large body of evidence from human and animal studies demonstrates that the maternal diet during pregnancy can programme physiological and metabolic functions in the developing fetus, effectively determining susceptibility to later disease. The mechanistic basis of such programming is unclear but may involve resetting of epigenetic marks and fetal gene expression. The aim of this study was to evaluate genome-wide DNA methylation and gene expression in the livers of newborn rats exposed to maternal protein restriction. On day one postnatally, there were 618 differentially expressed genes and 1183 differentially methylated regions (FDR 5%). The functional analysis of differentially expressed genes indicated a significant effect on DNA repair/cycle/maintenance functions and of lipid, amino acid metabolism and circadian functions. Enrichment for known biological functions was found to be associated with differentially methylated regions. Moreover, these epigenetically altered regions overlapped genetic loci associated with metabolic and cardiovascular diseases. Both expression changes and DNA methylation changes were largely reversed by supplementing the protein restricted diet with folic acid. Although the epigenetic and gene expression signatures appeared to underpin largely different biological processes, the gene expression profile of DNA methyl transferases was altered, providing a potential link between the two molecular signatures. The data showed that maternal protein restriction is associated with widespread differential gene expression and DNA methylation across the genome, and that folic acid is able to reset both molecular signatures.
PMCID: PMC3877003  PMID: 24391732
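The abstract above reports results at an FDR of 5% without naming the procedure used. As a minimal sketch, assuming the common Benjamini-Hochberg step-up procedure (an assumption, not something the abstract states), a list of p-values could be thresholded as follows. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.2]
print(benjamini_hochberg(p_values))  # [True, True, False, False, False]
```

The step-up rule rejects everything up to the largest qualifying rank, so a p-value that misses its own threshold can still be rejected if a larger p-value passes.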
10.  Characterization of the intronic portion of cadherin superfamily members, common cancer orchestrators 
Cadherins are cell–cell adhesion proteins essential for the maintenance of tissue architecture and integrity, and their impairment is often associated with human cancer. Knowledge regarding regulatory mechanisms associated with cadherin misexpression in cancer is scarce. Specific features of the intronic-structure and intronic-based regulatory mechanisms in the cadherin superfamily are unidentified. This study aims at systematically characterizing the intronic portion of cadherin superfamily members and the identification of intronic regions constituting putative targets/triggers of regulation, using a bioinformatic approach and biological data mining. Our study demonstrates that the cadherin superfamily genes harbour specific characteristics in comparison to all non-cadherin genes, both from the genomic and transcriptional standpoints. Cadherin superfamily genes display higher average total intron number and significantly longer introns than other genes and across the entire vertebrate lineage. Moreover, in the human genome, we observed an uncommonly high frequency of MIR (mammalian-wide interspersed repeats) and MaLR (mammalian apparent LTR retrotransposons, a subtype of LTR) regulatory-associated repetitive elements at 5′-located introns, concomitantly with increased de novo intronic transcription. Using this approach, we identified cadherin intronic-specific sites that may constitute novel targets/triggers of cadherin superfamily expression regulation. These findings pinpoint the need to identify mechanisms affecting particularly MIR and MaLR elements located in introns 2 and 3 of human cadherin genes, possibly important in the expression modulation of this superfamily in homeostasis and cancer.
PMCID: PMC3400724  PMID: 22317972
cadherin; cancer; intronic-based regulatory elements; MIR; MaLR; transcription
11.  A Strong Anti-Inflammatory Signature Revealed by Liver Transcription Profiling of Tmprss6−/− Mice 
PLoS ONE  2013;8(7):e69694.
Control of systemic iron homeostasis is interconnected with the inflammatory response through the key iron regulator, the antimicrobial peptide hepcidin. We have previously shown that mice with iron deficiency anemia (IDA)-low hepcidin show a pro-inflammatory response that is blunted in iron deficient-high hepcidin Tmprss6 KO mice. The transcriptional response associated with chronic hepcidin overexpression due to genetic inactivation of Tmprss6 is unknown. By using whole genome transcription profiling of the liver and analysis of spleen immune-related genes we identified several functional pathways differentially expressed in Tmprss6 KO mice, compared to IDA animals and thus irrespective of the iron status. To define genes that are potential targets of Tmprss6, we analyzed liver gene expression changes according to the genotype and independently of treatment. Tmprss6 inactivation causes down-regulation of liver pathways connected to immune and inflammatory response as well as spleen genes related to macrophage activation and inflammatory cytokine production. The anti-inflammatory status of Tmprss6 KO animals was confirmed by the down-regulation of pathways related to immunity, stress response and intracellular signaling in both liver and spleen after LPS treatment. In contrast to Tmprss6 KO mice, Hfe−/− mice are characterized by iron overload with inappropriately low hepcidin levels. Liver expression profiling of Hfe−/− versus iron-loaded mice shows the opposite expression of some of the genes modulated by the loss of Tmprss6. Altogether our results confirm the anti-inflammatory status of Tmprss6 KO mice and identify new potential target pathways/genes of Tmprss6.
PMCID: PMC3726786  PMID: 23922777
12.  Social Epigenetics and Equality of Opportunity 
Public Health Ethics  2013;6(2):142-153.
Recent epidemiological reports of associations between socioeconomic status and epigenetic markers that predict vulnerability to diseases are bringing to light substantial biological effects of social inequalities. Here, we start the discussion of the moral consequences of these findings. We firstly highlight their explanatory importance in the context of the research program on the Developmental Origins of Health and Disease (DOHaD) and the social determinants of health. In the second section, we review some theories of the moral status of health inequalities. Rather than a complete outline of the debate, we single out those theories that rest on the principle of equality of opportunity and analyze the consequences of DOHaD and epigenetics for these particular conceptions of justice. We argue that DOHaD and epigenetics reshape the conceptual distinction between natural and acquired traits on which these theories rely and might provide important policy tools to tackle unjust distributions of health.
PMCID: PMC3712403  PMID: 23864907
13.  An autoinflammatory neurological disease due to interleukin 6 hypersecretion 
Autoinflammatory diseases are rare illnesses characterized by apparently unprovoked inflammation without high-titer auto-antibodies or antigen-specific T cells. They may cause neurological manifestations, such as meningitis and hearing loss, but they are also characterized by non-neurological manifestations. In this work we studied a 30-year-old man who had a chronic disease characterized by meningitis, progressive hearing loss, persistently raised inflammatory markers and diffuse leukoencephalopathy on brain MRI. He also suffered from chronic recurrent osteomyelitis of the mandible. The hypothesis of an autoinflammatory disease prompted us to test for the presence of mutations in interleukin-1 pathway genes and to investigate the function of this pathway in the mononuclear cells obtained from the patient. Search for mutations in genes associated with the interleukin-1 pathway demonstrated a novel NLRP3 (CIAS1) mutation (p.I288M) and a previously described MEFV mutation (p.R761H), but their combination was found to be non-pathogenic. On the other hand, we uncovered a selective interleukin-6 hypersecretion within the central nervous system as the likely pathogenic mechanism. This is also supported by the response to the anti-interleukin-6 receptor monoclonal antibody tocilizumab, but not to the recombinant interleukin-1 receptor antagonist anakinra. Exome sequencing failed to identify mutations in other genes known to be involved in autoinflammatory diseases. We propose that the disease described in this patient might be a prototype of a novel category of autoinflammatory diseases characterized by prominent neurological involvement.
PMCID: PMC3601972  PMID: 23432807
Anakinra; Aseptic meningitis; NLRP3 (CIAS1); Hearing loss; Interleukin-1; Interleukin-6; Leukoencephalopathy; MEFV; Tocilizumab
14.  Highly conserved elements discovered in vertebrates are present in non-syntenic loci of tunicates, act as enhancers and can be transcribed during development 
Nucleic Acids Research  2013;41(6):3600-3618.
Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remain elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as ‘Olfactores conserved non-coding elements’.
PMCID: PMC3616699  PMID: 23393190
15.  Characterisation and Validation of Insertions and Deletions in 173 Patient Exomes 
PLoS ONE  2012;7(12):e51292.
Recent advances in genomics technologies have spurred unprecedented efforts in genome and exome re-sequencing aiming to unravel the genetic component of rare and complex disorders. While in rare disorders this allowed the identification of novel causal genes, the missing heritability paradox in complex diseases remains so far elusive. Despite rapid advances of next-generation sequencing, both the technology and the analysis of the data it produces are in their infancy. At present there is abundant knowledge pertaining to the role of rare single nucleotide variants (SNVs) in rare disorders and of common SNVs in common disorders. Although the 1000 Genomes Project has clearly highlighted the prevalence of rare variants and more complex variants (e.g. insertions, deletions), their role in disease is as yet far from elucidated.
We set out to analyse the properties of sequence variants identified in a comprehensive collection of exome re-sequencing studies performed on samples from patients affected by a broad range of complex and rare diseases (N = 173). Given the known potential for Loss of Function (LoF) variants to be false positive, we performed an extensive validation of the common, rare and private LoF variants identified, which indicated that most of the private and rare variants identified were indeed true, while common novel variants had a significantly higher false positive rate. Our results indicated a strong enrichment of very low-frequency insertion/deletion variants, so far under-investigated, which might be difficult to capture with low coverage and imputation approaches and for which most study designs would be under-powered. These insertions and deletions might play a significant role in disease genetics, contributing specifically to the underlying rare and private variation predicted to be discovered through next generation sequencing.
PMCID: PMC3522676  PMID: 23251486
16.  ParkDB: a Parkinson’s disease gene expression database 
Parkinson’s disease (PD) is a common, adult-onset, neuro-degenerative disorder characterized by cardinal motor signs, mainly due to the loss of dopaminergic neurons in the substantia nigra. To date, researchers still have limited understanding of the key molecular events that provoke neurodegeneration in this disease. Here, we present ParkDB, the first queryable database dedicated to gene expression in PD. ParkDB contains a complete set of re-analyzed, curated and annotated microarray datasets. This resource enables scientists to identify and compare expression signatures involved in PD and dopaminergic neuron differentiation under different biological conditions and across species.
Database URL:
PMCID: PMC3098727  PMID: 21593080
17.  Integrated Genetic and Epigenetic Analysis Identifies Haplotype-Specific Methylation in the FTO Type 2 Diabetes and Obesity Susceptibility Locus 
PLoS ONE  2010;5(11):e14040.
Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40×10^-4, permutation p = 1.0×10^-3). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13×10^-7). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases.
PMCID: PMC2987816  PMID: 21124985
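The abstract above quotes a permutation p-value for the methylation difference between haplotype groups. A minimal sketch of a two-sided permutation test on the difference in group means is given below. It is a generic illustration, not the authors' analysis: the function name, the number of permutations, the fixed seed and the methylation values are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up methylation levels for risk and non-risk haplotype carriers.
risk = [0.80, 0.82, 0.79, 0.85, 0.81]
non_risk = [0.60, 0.62, 0.58, 0.61, 0.59]
print(permutation_p_value(risk, non_risk))
```

With completely separated groups such as these, only relabellings that reproduce the original split (or its mirror image) match the observed difference, so the estimated p-value is small. The smallest value the estimate can take is 1 / (n_permutations + 1).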
18.  The IKMC web portal: a central point of entry to data and resources from the International Knockout Mouse Consortium 
Nucleic Acids Research  2010;39(Database issue):D849-D855.
The International Knockout Mouse Consortium (IKMC) aims to mutate all protein-coding genes in the mouse using a combination of gene targeting and gene trapping in mouse embryonic stem (ES) cells and to make the generated resources readily available to the research community. The IKMC database and web portal ( serves as the central public web site for IKMC data and facilitates the coordination and prioritization of work within the consortium. Researchers can access up-to-date information on IKMC knockout vectors, ES cells and mice for specific genes, and follow links to the respective repositories from which corresponding IKMC products can be ordered. Researchers can also use the web site to nominate genes for targeting, or to indicate that targeting of a gene should receive high priority. The IKMC database provides data to, and features extensive interconnections with, other community databases.
PMCID: PMC3013768  PMID: 20929875
19.  Promiscuity of enhancer, coding and non-coding transcription functions in ultraconserved elements 
BMC Genomics  2010;11:151.
Ultraconserved elements (UCEs) are highly constrained elements of mammalian genomes, whose functional role has not been completely elucidated yet. Previous studies have shown that some of them act as enhancers in mouse, while some others are expressed in both normal and cancer-derived human tissues. Only one UCE element so far was shown to present these two functions concomitantly, as had been observed in other isolated instances of single, non-ultraconserved enhancer elements.
We used a custom microarray to assess the levels of UCE transcription during mouse development and integrated these data with published microarray and next-generation sequencing datasets as well as with newly produced PCR validation experiments. We show that a large fraction of non-exonic UCEs is transcribed across all developmental stages examined from only one DNA strand. Although the nature of these transcripts remains a mystery, our meta-analysis of RNA-Seq datasets indicates that they are unlikely to be short RNAs and that some of them might encode nuclear transcripts. In the majority of cases this function overlaps with the already established enhancer function of these elements during mouse development. Utilizing several next-generation sequencing datasets, we were further able to show that the level of expression observed in non-exonic UCEs is significantly higher than in random regions of the genome and that this is also seen in other regions which act as enhancers.
Our data show that the concurrent presence of enhancer and transcript function in non-exonic UCE elements is more widespread than previously shown. Moreover, through our own experiments as well as the use of next-generation sequencing datasets, we were able to show that the RNAs encoded by non-exonic UCEs are likely to be long RNAs transcribed from only one DNA strand.
PMCID: PMC2847969  PMID: 20202189
20.  Mixed lineage kinase 3 gene mutations in mismatch repair deficient gastrointestinal tumours 
Human Molecular Genetics  2009;19(4):697-706.
Mixed lineage kinase 3 (MLK3) is a serine/threonine kinase, regulating MAPkinase signalling, in which cancer-associated mutations have never been reported. In this study, 174 primary gastrointestinal cancers (48 hereditary and 126 sporadic forms) and 7 colorectal cancer cell lines were screened for MLK3 mutations. MLK3 mutations were significantly associated with MSI phenotype in primary tumours (P = 0.0005), occurring in 21% of the MSI carcinomas. Most MLK3 somatic mutations identified were of the missense type (62.5%) and more than 80% of them affected evolutionarily conserved residues. A predictive 3D model points to the functional relevance of MLK3 missense mutations, which cluster in the kinase domain. Further, the model shows that most of the altered residues in the kinase domain probably affect MLK3 scaffold properties, instead of its kinase activity. MLK3 missense mutations showed transforming capacity in vitro and cells expressing the mutant gene were able to develop locally invasive tumours, when subcutaneously injected in nude mice. Interestingly, in primary tumours, MLK3 mutations occurred in KRAS and/or BRAF wild-type carcinomas, although not being mutually exclusive genetic events. In conclusion, we have demonstrated for the first time the presence of MLK3 mutations in cancer and its association to mismatch repair deficiency. Further, we demonstrated that MLK3 missense mutations found in MSI gastrointestinal carcinomas are functionally relevant.
PMCID: PMC2807374  PMID: 19955118
21.  PRGdb: a bioinformatics platform for plant resistance gene analysis 
Nucleic Acids Research  2009;38(Database issue):D814-D821.
PRGdb is a web accessible open-source database that represents the first bioinformatic resource providing a comprehensive overview of resistance genes (R-genes) in plants. PRGdb holds more than 16 000 known and putative R-genes belonging to 192 plant species challenged by 115 different pathogens and linked with useful biological information. The complete database includes a set of 73 manually curated reference R-genes, 6308 putative R-genes collected from NCBI and 10463 computationally predicted putative R-genes. Thanks to a user-friendly interface, data can be examined using different query tools. A home-made prediction pipeline called Disease Resistance Analysis and Gene Orthology (DRAGO), based on reference R-gene sequence data, was developed to search for plant resistance genes in public datasets such as Unigene and Genbank. New putative R-gene classes containing unknown domain combinations were discovered and characterized. The development of the PRG platform represents an important starting point to conduct various experimental tasks. The inferred cross-link between genomic and phenotypic information allows access to a large body of information to find answers to several biological questions. The database structure also permits easy integration with other data types and opens up prospects for future implementations.
PMCID: PMC2808903  PMID: 19906694
22.  Germline CDH1 deletions in hereditary diffuse gastric cancer families 
Human Molecular Genetics  2009;18(9):1545-1555.
Germline CDH1 point or small frameshift mutations can be identified in 30–50% of hereditary diffuse gastric cancer (HDGC) families. We hypothesized that CDH1 genomic rearrangements would be found in HDGC and identified 160 families with either two gastric cancers in first-degree relatives and with at least one diffuse gastric cancer (DGC) diagnosed before age 50, or three or more DGC in close relatives diagnosed at any age. Sixty-seven carried germline CDH1 point or small frameshift mutations. We screened germline DNA from the 93 mutation negative probands for large genomic rearrangements by Multiplex Ligation-Dependent Probe Amplification. Potential deletions were validated by RT–PCR and breakpoints cloned using a combination of oligo-CGH-arrays and long-range-PCR. In-silico analysis of the CDH1 locus was used to determine a potential mechanism for these rearrangements. Six of 93 (6.5%) previously described mutation negative HDGC probands, from low GC incidence populations (UK and North America), carried genomic deletions. Two families carried an identical deletion spanning 193 593 bp, encompassing the full CDH3 sequence and CDH1 exons 1 and 2. Other deletions affecting exons 1, 2, 15 and/or 16 were identified. The statistically significant over-representation of Alus around breakpoints indicates it as a likely mechanism for these deletions. When all mutations and deletions are considered, the overall frequency of CDH1 alterations in HDGC is ∼46% (73/160). CDH1 large deletions occur in 4% of HDGC families by mechanisms involving mainly non-allelic homologous recombination in Alu repeat sequences. As the finding of pathogenic CDH1 mutations is useful for management of HDGC families, screening for deletions should be offered to at-risk families.
PMCID: PMC2667284  PMID: 19168852
23.  The UniTrap resource: tools for the biologist enabling optimized use of gene trap clones 
Nucleic Acids Research  2007;36(Database issue):D741-D746.
We have developed a comprehensive resource devoted to biologists wanting to optimize the use of gene trap clones in their experiments. We have processed 300 602 such clones from both public and private projects to generate 28 199 ‘UniTraps’, i.e. distinct collections of unambiguous insertions at the same subgenic region of annotated genes. The UniTrap resource contains data relative to 9583 trapped genes, which represent 42.3% of the mouse gene content. Among the trapped genes, 7 728 have a counterpart in humans, and 677 are known to be involved in the pathogenesis of human diseases. The aim of this analysis is to provide the wet lab researchers with a comprehensive database and curated tools for (i) identifying and comparing the clones carrying a trap into the genes of interest, (ii) evaluating the severity of the mutation to the protein function in each independent trapping event and (iii) supplying complete information to perform PCR, RT-PCR and restriction experiments to verify the clone and identify the exact point of vector insertion. To share this unique resource with the scientific community, we have designed and implemented a web interface that is freely accessible at
PMCID: PMC2238955  PMID: 17942430
24.  The TATA-binding protein regulates maternal mRNA degradation and differential zygotic transcription in zebrafish 
The EMBO Journal  2007;26(17):3945-3956.
Early steps of embryo development are directed by maternal gene products and trace levels of zygotic gene activity in vertebrates. A major activation of zygotic transcription occurs together with degradation of maternal mRNAs during the midblastula transition in several vertebrate systems. How these processes are regulated in preparation for the onset of differentiation in the vertebrate embryo is mostly unknown. Here, we studied the function of TATA-binding protein (TBP) by knock down and DNA microarray analysis of gene expression in early embryo development. We show that a subset of polymerase II-transcribed genes with ontogenic stage-dependent regulation requires TBP for their zygotic activation. TBP is also required for limiting the activation of genes during development. We reveal that TBP plays an important role in the degradation of a specific subset of maternal mRNAs during late blastulation/early gastrulation, which involves targets of the miR-430 pathway. Hence, TBP acts as a specific regulator of the key processes underlying the transition from maternal to zygotic regulation of embryogenesis. These results implicate core promoter recognition as an additional level of differential gene regulation during development.
PMCID: PMC1950726  PMID: 17703193
maternal mRNA; MBT; TBP; transcription; zebrafish
25.  BASC: an integrated bioinformatics system for Brassica research 
Nucleic Acids Research  2006;35(Database issue):D870-D873.
The BASC system provides tools for the integrated mining and browsing of genetic, genomic and phenotypic data. This public resource hosts information on Brassica species supporting the Multinational Brassica Genome Sequencing Project, and is based upon five distinct modules, ESTDB, Microarray, MarkerQTL, CMap and EnsEMBL. ESTDB hosts expressed gene sequences and related annotation derived from comparison with GenBank, UniRef and the genome sequence of Arabidopsis. The Microarray module hosts gene expression information related to genes annotated within ESTDB. MarkerQTL is the most complex module and integrates information on genetic markers, maps, individuals, genotypes and traits. Two further modules include an Arabidopsis EnsEMBL genome viewer and the CMap comparative genetic map viewer for the visualization and integration of genetic and genomic data. The database is accessible at .
PMCID: PMC1761444  PMID: 17148473
