Results 1-9 (9)
 

1.  CIRCUS: a package for Circos display of structural genome variations from paired-end and mate-pair sequencing data 
BMC Bioinformatics  2014;15:198.
Background
Detection of large genomic rearrangements, such as large indels, duplications or translocations, is now commonly achieved by next generation sequencing (NGS) approaches. Several tools have recently been developed to analyze NGS data, but the resulting files are difficult to interpret without an additional visualization step. Circos (Genome Res, 19:1639–1645, 2009), a Perl script, is a powerful visualization tool, but it requires setting up numerous configuration files with a large number of parameters. R packages like RCircos (BMC Bioinformatics, 14:244, 2013) or ggbio (Genome Biol, 13:R77, 2012) provide functions to display genomic data as circular Circos-like plots. However, these tools are very general and lack the functions needed to filter, format and adjust specific input genomic data.
Results
We implemented an R package called CIRCUS to analyze genomic structural variations. It generates both the data files and the configuration files that Circos needs to produce graphs. Only a few R prerequisites are needed. Options are available to deal with heterogeneous data, various chromosome numbers and multi-scale analysis.
Conclusion
CIRCUS allows fast and versatile analysis of genomic structural variants with Circos plots for users with limited coding skills.
doi:10.1186/1471-2105-15-198
PMCID: PMC4071023  PMID: 24938393
Circos; Genomic structural variants; Genomic data visualization
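The filtering and formatting step described in this abstract can be illustrated with a minimal Python sketch. It is not the CIRCUS API (CIRCUS is an R package whose functions are not described here). The `Variant` record, its `support` field and the `min_support` threshold are hypothetical names chosen for illustration. The output lines follow the whitespace-separated Circos link format (two chromosome/start/end triplets per line).

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """One structural-variant call linking two genomic intervals (hypothetical record)."""
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    support: int  # number of read pairs supporting the call (assumed field)


def to_circos_links(variants, min_support=2):
    """Keep calls with enough supporting pairs and format them as Circos link lines."""
    lines = []
    for v in variants:
        if v.support < min_support:
            continue  # drop weakly supported calls
        lines.append(
            f"{v.chrom1} {v.start1} {v.end1} {v.chrom2} {v.start2} {v.end2}"
        )
    return lines


calls = [
    Variant("hs1", 100, 200, "hs5", 300, 400, 5),
    Variant("hs2", 10, 20, "hs2", 5000, 5010, 1),
]
for line in to_circos_links(calls):
    print(line)
```

Only the first call passes the default threshold here. The chromosome labels must match the identifiers in whichever Circos karyotype file is used.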
2.  BRASERO: A Resource for Benchmarking RNA Secondary Structure Comparison Algorithms 
Advances in Bioinformatics  2012;2012:893048.
The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets.
doi:10.1155/2012/893048
PMCID: PMC3366197  PMID: 22675348
3.  Replication Fork Polarity Gradients Revealed by Megabase-Sized U-Shaped Replication Timing Domains in Human Cell Lines 
PLoS Computational Biology  2012;8(4):e1002443.
In higher eukaryotes, replication program specification in different cell types remains to be fully understood. We show for seven human cell lines that about half of the genome is divided into domains that display a characteristic U-shaped replication timing profile, with early initiation zones at borders and late replication at centers. Significant overlap is observed between U-domains of different cell lines and also with germline replication domains exhibiting an N-shaped nucleotide compositional skew. Having demonstrated that the average fork polarity is directly reflected by both the compositional skew and the derivative of the replication timing profile, we argue that the N-shape of this derivative in U-domains supports the existence of large-scale gradients of replication fork polarity in somatic and germline cells. Analysis of chromatin interaction (Hi-C) and chromatin marker data reveals that U-domains correspond to high-order chromatin structural units. We discuss possible models for replication origin activation within U/N-domains. The compartmentalization of the genome into replication U/N-domains provides new insights into the organization of the replication program in the human genome.
Author Summary
DNA replication in human cells requires the parallel progression along the genome of thousands of replication machineries. Comprehensive knowledge of genetic inheritance at different development stages relies on elucidating the mechanisms that regulate the location and progression of these machineries throughout the duration of the DNA synthetic phase of the cell cycle. Here, we determine in multiple human cell types the existence of a new type of megabase-sized replication domains across which the average orientation of the replication machinery changes in a linear manner. These domains are revealed in 7 somatic cell types by a U-shaped pattern in the replication timing profiles as well as by N-shaped patterns in the DNA compositional asymmetry profile reflecting the existence of a replication-associated mutational asymmetry in the germline. These domains therefore correspond to a robust mode of replication across cell types and during evolution. Using genome-wide data on the frequency of interaction of distant chromatin segments in two cell lines, we find that these U/N-replication domains remarkably correspond to self-interacting folding units of the chromatin fiber.
doi:10.1371/journal.pcbi.1002443
PMCID: PMC3320577  PMID: 22496629
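The link between a U-shaped timing profile and a linearly varying derivative can be checked numerically. The sketch below is only an illustration under stated assumptions. The profile is an idealized parabola, not measured data. The "earliness" score (high at the early-replicating borders, low at the late-replicating center), the bin count and the depth are all hypothetical choices. The derivative is estimated with central finite differences.

```python
def u_profile(n_bins=21, depth=1.0):
    """Idealized earliness score per bin: 1.0 at the borders, 1.0 - depth at the center."""
    center = (n_bins - 1) / 2
    return [
        1.0 - depth * (1.0 - ((i - center) / center) ** 2)
        for i in range(n_bins)
    ]


def derivative(values, step=1.0):
    """Central finite differences, defined on interior bins only."""
    return [
        (values[i + 1] - values[i - 1]) / (2 * step)
        for i in range(1, len(values) - 1)
    ]


profile = u_profile()
slope = derivative(profile)

# The slope is negative on the left half, zero at the center and positive on
# the right half, and it changes by the same amount from one bin to the next.
print(slope[0], slope[len(slope) // 2], slope[-1])
```

Within the domain the derivative is a straight line. Placing several such domains side by side would add a jump at each border, which gives the N-shaped pattern the abstract refers to.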
4.  Evidence for Sequential and Increasing Activation of Replication Origins along Replication Timing Gradients in the Human Genome 
PLoS Computational Biology  2011;7(12):e1002322.
Genome-wide replication timing studies have suggested that mammalian chromosomes consist of megabase-scale domains of coordinated origin firing separated by large originless transition regions. Here, we report a quantitative genome-wide analysis of DNA replication kinetics in several human cell types that contradicts this view. DNA combing in HeLa cells sorted into four temporal compartments of S phase shows that replication origins are spaced at 40 kb intervals and fire as small clusters whose synchrony increases during S phase, and that replication fork velocity (mean 0.7 kb/min, maximum 2.0 kb/min) remains constant and narrowly distributed through S phase. However, multi-scale analysis of a genome-wide replication timing profile shows a broad distribution of replication timing gradients with practically no regions larger than 100 kb replicating at less than 2 kb/min. Therefore, HeLa cells lack large regions of unidirectional fork progression. Temporal transition regions are replicated by sequential activation of origins at a rate that increases during S phase, and replication timing gradients are set by the delay and the spacing between successive origin firings rather than by the velocity of single forks. Activation of internal origins in a specific temporal transition region is directly demonstrated by DNA combing of the IGH locus in HeLa cells. Analysis of published origin maps in HeLa cells and of published replication timing and DNA combing data in several other cell types corroborates these findings, with the interesting exception of embryonic stem cells, where regions of unidirectional fork progression seem more abundant. These results can be explained if origins fire independently of each other but under the control of long-range chromatin structure, or if replication forks progressing from early origins stimulate initiation in nearby unreplicated DNA.
These findings shed new light on the replication timing program of mammalian genomes and provide a general model for their replication kinetics.
Author Summary
Eukaryotic chromosomes replicate from multiple replication origins that fire at different times in S phase. The mechanisms that specify origin position and firing time and coordinate origins to ensure complete genome duplication are unclear. Previous studies proposed either that origins are arranged in temporally coordinated groups or fire independently of each other in a stochastic manner. Here, we have performed a quantitative analysis of human genome replication kinetics using a combination of DNA combing, which reveals local patterns of origin firing and replication fork progression on single DNA molecules, and massive sequencing of newly replicated DNA, which reveals the population-averaged replication timing profile of the entire genome. We show that origins are activated synchronously in large regions of uniform replication timing but more gradually in temporal transition regions and that the rate of origin firing increases as replication progresses. Large regions of unidirectional fork progression are abundant in embryonic stem cells but rare in differentiated cells. We propose a model in which replication forks progressing from early origins stimulate initiation in nearby unreplicated DNA in a manner that explains the shape of the replication timing profile. These results provide a fundamental insight into the temporal regulation of mammalian genome replication.
doi:10.1371/journal.pcbi.1002322
PMCID: PMC3248390  PMID: 22219720
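The argument that timing gradients are set by origin spacing and firing delay rather than by single-fork velocity can be made concrete with a little arithmetic. The sketch below is a deliberately simplified model and not the authors' analysis. The 40 kb spacing and the 0.7 kb/min mean fork velocity come from the abstract. The 20 min delay between successive firings is a hypothetical value chosen to reproduce an apparent speed of 2 kb/min.

```python
def apparent_speed(spacing_kb, delay_min):
    """Apparent replication speed (kb/min) when origins fire one after another."""
    return spacing_kb / delay_min


def fork_travel_time(spacing_kb, fork_speed_kb_per_min):
    """Minutes a single fork needs to cover the distance between two origins."""
    return spacing_kb / fork_speed_kb_per_min


SPACING_KB = 40.0   # inter-origin distance reported in the abstract
FORK_SPEED = 0.7    # mean fork velocity (kb/min) reported in the abstract
DELAY_MIN = 20.0    # hypothetical delay between successive origin firings

print(apparent_speed(SPACING_KB, DELAY_MIN))     # apparent speed in kb/min
print(fork_travel_time(SPACING_KB, FORK_SPEED))  # minutes for one fork to cover 40 kb
```

With these numbers the next origin fires well before a fork from the previous origin could arrive, so the region appears to replicate faster than any single fork moves.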
5.  Open chromatin encoded in DNA sequence is the signature of ‘master’ replication origins in human cells 
Nucleic Acids Research  2009;37(18):6064-6075.
For years, progress in elucidating the mechanisms underlying replication initiation and its coupling to transcriptional activities and to local chromatin structure has been hampered by the small number (approximately 30) of well-established origins in the human genome and more generally in mammalian genomes. Recent in silico studies of compositional strand asymmetries revealed a high level of organization of human genes around 1000 putative replication origins. Here, by comparing with recently experimentally identified replication origins, we provide further support that these putative origins are active in vivo. We show that regions ∼300-kb wide surrounding most of these putative replication origins that replicate early in the S phase are hypersensitive to DNase I cleavage, hypomethylated and present a significant enrichment in genomic energy barriers that impair nucleosome formation (nucleosome-free regions). This suggests that these putative replication origins are specified by an open chromatin structure favored by the DNA sequence. We discuss how this distinctive attribute makes these origins, further qualified as ‘master’ replication origins, privileged loci for future research to decipher the human spatio-temporal replication program. Finally, we argue that these ‘master’ origins are likely to play a key role in genome dynamics during evolution and in pathological situations.
doi:10.1093/nar/gkp631
PMCID: PMC2764438  PMID: 19671527
6.  Neurodevelopment Genes in Lampreys Reveal Trends for Forebrain Evolution in Craniates 
PLoS ONE  2009;4(4):e5374.
The forebrain is the brain region which has undergone the most dramatic changes through vertebrate evolution. Analyses conducted in lampreys are essential to gain insight into the broad ancestral characteristics of the forebrain at the dawn of vertebrates, and to understand the molecular basis for the diversifications that have taken place in cyclostomes and gnathostomes since their divergence. Here, we report the embryonic expression patterns of 43 lamprey genes, coding for transcription factors or signaling molecules known to be involved in cell proliferation, stemness, neurogenesis, patterning and regionalization in the developing forebrain. Systematic comparisons of expression patterns with model organisms highlight conserved patterns likely to reflect shared features present in the vertebrate ancestors. They also point to changes in signaling systems (pathways which control the growth and patterning of the neuroepithelium), which may have been crucial in the evolution of forebrain anatomy at the origin of vertebrates.
doi:10.1371/journal.pone.0005374
PMCID: PMC2671401  PMID: 19399187
7.  DNA physical properties determine nucleosome occupancy from yeast to fly 
Nucleic Acids Research  2008;36(11):3746-3756.
Nucleosome positioning plays an essential role in cellular processes by modulating accessibility of DNA to proteins. Here, using only sequence-dependent DNA flexibility and intrinsic curvature, we predict the nucleosome occupancy along the genomes of Saccharomyces cerevisiae and Drosophila melanogaster and demonstrate the predictive power and universality of our model through its correlation with experimentally determined nucleosome occupancy data. In yeast promoter regions, the computed average nucleosome occupancy closely superimposes with experimental data, exhibiting a <200 bp region unfavourable for nucleosome formation bordered by regions that facilitate nucleosome formation. In the fly, our model faithfully predicts promoter strength as encoded in distinct chromatin architectures characteristic of strongly and weakly expressed genes. We also predict that nucleosomes are repositioned by active mechanisms at the majority of fly promoters. Our model uses only basic physical properties to describe the wrapping of DNA around the histone core, yet it captures a substantial part of chromatin's structural complexity, thus leading to a much better prediction of nucleosome occupancy than methods based merely on periodic curved DNA motifs. Our results indicate that the physical properties of the DNA chain, and not just the regulatory factors and chromatin-modifying enzymes, play key roles in eukaryotic transcription.
doi:10.1093/nar/gkn262
PMCID: PMC2441789  PMID: 18487627
8.  Wavelet Analysis of DNA Bending Profiles reveals Structural Constraints on the Evolution of Genomic Sequences 
Journal of Biological Physics  2004;30(1):33-81.
Previous analyses of genomic DNA sequences have shown that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) in fact result from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called 'wavelet transform microscope' reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (≲ 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain.
This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic sequences. We review the results obtained with the so-called wavelet-based multifractal analysis when investigating the DNA sequences of various organisms in the three kingdoms. Some of these results have been announced in B. Audit et al. [1, 2].
doi:10.1023/B:JOBP.0000016438.86794.8e
PMCID: PMC3456503  PMID: 23345861
chromatin; DNA bending profile; fractals; genomic DNA sequence; long-range correlations; nucleosome; scale-invariance; wavelet transform
9.  Transcription-coupled and splicing-coupled strand asymmetries in eukaryotic genomes 
Nucleic Acids Research  2004;32(17):4969-4978.
Under no-strand bias conditions, each genomic DNA strand should present equimolarities of A and T and of G and C. Deviations from these rules are attributed to asymmetric properties intrinsic to DNA mutation–repair processes. In bacteria, strand biases are associated with replication or transcription. In eukaryotes, recent studies demonstrate that human genes present transcription-coupled biases that might reflect transcription-coupled repair processes. Here, we study strand asymmetries in intron sequences of evolutionarily distant eukaryotes, and show that two superimposed intron biases can be distinguished. (i) Biases that are maximum at intron extremities and decrease over large distances to zero values in internal regions, possibly reflecting interactions between pre-mRNA and splicing machinery; these extend over ∼0.5 kb in mammals and Arabidopsis thaliana, and over 1 kb in Caenorhabditis elegans and Drosophila melanogaster. (ii) Biases that are constant along introns, possibly associated with transcription. Strikingly, in C. elegans, these latter biases extend over intergenic regions that separate co-oriented genes. When appropriately examined, all genomes present a transcription-coupled excess of T over A in the coding strand. In contrast, GC skews are either positive (mammals, plants) or negative (invertebrates). These results suggest that transcription-coupled asymmetries result from mutation–repair mechanisms that differ between vertebrates and invertebrates.
doi:10.1093/nar/gkh823
PMCID: PMC521644  PMID: 15388799
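The strand asymmetries discussed in this abstract are commonly quantified as normalized skews. The sketch below assumes the usual definitions, TA skew = (T − A)/(T + A) and GC skew = (G − C)/(G + C), computed on a single strand. The abstract itself does not spell out its formulas, so this is an illustration of the concept and not a reproduction of the paper's method. The function name and the example sequence are hypothetical.

```python
from collections import Counter


def strand_skews(sequence):
    """Return (TA skew, GC skew) for one DNA strand; 0.0 when a pair is absent."""
    counts = Counter(sequence.upper())
    a, t = counts["A"], counts["T"]
    g, c = counts["G"], counts["C"]
    ta_skew = (t - a) / (t + a) if (t + a) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return ta_skew, gc_skew


# Toy sequence with an excess of T over A and balanced G and C.
ta, gc = strand_skews("TTTAGGCC")
print(ta, gc)
```

Under no-strand bias conditions both values are expected to be close to zero. A positive TA skew corresponds to the excess of T over A mentioned in the abstract.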
