
Results 1-6 (6)

1.  Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells 
PLoS ONE  2012;7(1):e28213.
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.
PMCID: PMC3251577  PMID: 22238572
2.  Efficient targeted transcript discovery via array-based normalization of RACE libraries 
Nature methods  2008;5(7):629-635.
RACE (Rapid Amplification of cDNA Ends) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. Here, we describe a strategy that uses array hybridization to improve sampling efficiency of human transcripts. The products of the RACE reaction are hybridized onto tiling arrays, and the exons detected are used to delineate a series of RT-PCR reactions, through which the original RACE mixture is segregated into simpler RT-PCR reactions. These are independently cloned, and randomly selected clones are sequenced. This approach is superior to direct cloning and sequencing of RACE products: it specifically targets novel transcripts, and often results in overall normalization of transcript abundances. We show theoretically and experimentally that this strategy indeed leads to efficient sampling of novel transcripts, and we investigate multiplexing it by pooling RACE reactions from multiple interrogated loci prior to hybridization.
PMCID: PMC2713501  PMID: 18500348
3.  Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening 
PLoS Genetics  2009;5(6):e1000522.
To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular.
Author Summary
Long-range genetic control is an inherent feature of genes harbouring a highly complex spatiotemporal expression pattern, requiring a combined action of multiple cis-regulatory elements such as promoters, enhancers, and silencers. Consequently, disruption of the long-range genetic control of a target gene by genomic rearrangements of regulatory elements may lead to aberrant gene transcription and disease. To date, the contribution of mutated regulatory elements to human disease has not been studied frequently. Here, we explored the contribution of genetic changes in potentially cis-regulatory elements of the FOXL2 gene in blepharophimosis syndrome (BPES), a developmental monogenic condition of the eyelids and ovaries. We identified a very subtle de novo deletion of 7.4 kb causing BPES. Moreover, we studied the functional capacities and chromosome conformation of the deleted region in FOXL2 expressing cellular systems. Interestingly, the chromosome conformation analysis demonstrated the close proximity of the 7.4 kb deleted fragment and two other conserved regions with the FOXL2 core promoter, and the necessity of their integrity for correct FOXL2 expression. Finally, our study revealed the smallest distant deletion causing monogenic disease and emphasized the importance of mutation screening of cis-regulatory elements in human genetic disease.
PMCID: PMC2689649  PMID: 19543368
4.  EGASP: the human ENCODE Genome Annotation Assessment Project 
Genome Biology  2006;7(Suppl 1):S2.
We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment.
The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified.
This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
PMCID: PMC1810551  PMID: 16925836
5.  GENCODE: producing a reference annotation for ENCODE 
Genome Biology  2006;7(Suppl 1):S4.
The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results.
The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions.
In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of GENCODE annotation with RefSeq and ENSEMBL shows that only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP, and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.
PMCID: PMC1810553  PMID: 16925838
6.  Evolutionary Comparison Provides Evidence for Pathogenicity of RMRP Mutations 
PLoS Genetics  2005;1(4):e47.
Cartilage-hair hypoplasia (CHH) is a pleiotropic disease caused by recessive mutations in the RMRP gene that result in a wide spectrum of manifestations including short stature, sparse hair, metaphyseal dysplasia, anemia, immune deficiency, and increased incidence of cancer. Molecular diagnosis of CHH has implications for management, prognosis, follow-up, and genetic counseling of affected patients and their families. We report 20 novel mutations in 36 patients with CHH and describe the associated phenotypic spectrum. Given the high mutational heterogeneity (62 mutations reported to date), the high frequency of variations in the region (eight single nucleotide polymorphisms in and around RMRP), and the fact that RMRP is not translated into protein, prediction of mutation pathogenicity is difficult. We addressed this issue by a comparative genomic approach and aligned the genomic sequences of the RMRP gene in the entire class of mammals. We found that putative pathogenic mutations are located in highly conserved nucleotides, whereas polymorphisms are located in non-conserved positions. We conclude that the abundance of variations in this small gene is remarkable and at odds with its high conservation through species; it is unclear whether these variations are caused by a high local mutation rate, a failure of repair mechanisms, or a relaxed selective pressure. The marked diversity of mutations in RMRP and the low homozygosity rate in our patient population indicate that CHH is more common than previously estimated, but may go unrecognized because of its variable clinical presentation. Thus, RMRP molecular testing may be indicated in individuals with isolated metaphyseal dysplasia, anemia, or immune dysregulation.
Cartilage-hair hypoplasia is a genetic condition named after two of its most conspicuous features, short bones and sparse hair, but it also affects blood-forming tissues, the immune system, and the intestine. It is caused by sequence mutations in RMRP, a small gene that codes for a structural RNA component of an RNAse complex whose biological functions have been elusive so far. The small RMRP gene carries a surprisingly high number of sequence variations, and because its transcript is not translated into protein and its function in the cell is still unclear, distinction between harmless variants and disease-causing mutations (more than 60 have been found so far by the authors and others) is difficult. The authors have sequenced the RMRP gene in several species covering the whole class of mammals and found that the gene is remarkably conserved between species. Interestingly, mutations occurring in conserved (probably functionally important) regions of the gene appear to be disease-producing, whereas those occurring in regions where evolution is more relaxed seem to be harmless variants. These results will help in counseling affected individuals and their families, and may lead to the discovery of the real function of this mysterious gene.
PMCID: PMC1262189  PMID: 16244706
