Results 1-6 (6)

1.  Fiji - an Open Source platform for biological image analysis 
Nature methods  2012;9(7):10.1038/nmeth.2019.
Fiji is a distribution of the popular Open Source software ImageJ focused on biological image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image processing algorithms. Fiji facilitates the transformation of novel algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.
PMCID: PMC3855844  PMID: 22743772
2.  Extraction and comparison of gene expression patterns from 2D RNA in situ hybridization images 
Bioinformatics  2009;26(6):761-769.
Motivation: Recent advancements in high-throughput imaging have created new large datasets with tens of thousands of gene expression images. Methods for capturing these spatial and/or temporal expression patterns include in situ hybridization or fluorescent reporter constructs or tags, and results are still frequently assessed by subjective qualitative comparisons. In order to deal with available large datasets, fully automated analysis methods must be developed to properly normalize and model spatial expression patterns.
Results: We have developed image segmentation and registration methods to identify and extract spatial gene expression patterns from RNA in situ hybridization experiments of Drosophila embryos. These methods allow us to normalize and extract expression information for 78 621 images from 3724 genes across six time stages. The similarity between gene expression patterns is computed using four scoring metrics: mean squared error, Haar wavelet distance, mutual information and spatial mutual information (SMI). We additionally propose a strategy to calculate the significance of the similarity between two expression images, by generating surrogate datasets with similar spatial expression patterns using a Monte Carlo swap sampler. On data from an early development time stage, we show that SMI provides the most biologically relevant metric of comparison, and that our significance testing generalizes metrics to achieve similar performance. We exemplify the application of spatial metrics on the well-known Drosophila segmentation network.
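As a rough illustration of two of the four scoring metrics named above, the sketch below computes mean squared error and mutual information for small, already registered and discretized images. It is a minimal sketch, not the authors' implementation: the flattened toy images, the three intensity levels and the function names are assumptions made for this example, and the Haar wavelet distance, SMI and the Monte Carlo swap sampler are not covered.

```python
import math
from collections import Counter


def mean_squared_error(a, b):
    """Mean squared difference between two equally sized pixel lists."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mutual_information(a, b):
    """Mutual information (in bits) between two discretized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    n = len(a)
    joint = Counter(zip(a, b))  # counts of co-occurring intensity pairs
    count_a = Counter(a)        # marginal counts for image a
    count_b = Counter(b)        # marginal counts for image b
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
    return mi


# Toy flattened images with three expression levels (hypothetical data).
img_a = [0, 0, 1, 1, 2, 2, 0, 0]
img_b = [0, 0, 1, 1, 2, 2, 0, 0]  # identical to img_a
img_c = [2, 2, 1, 1, 0, 0, 2, 2]  # img_a with inverted intensities

print(mean_squared_error(img_a, img_b), mutual_information(img_a, img_b))
print(mean_squared_error(img_a, img_c), mutual_information(img_a, img_c))
```

In this toy case the inverted image has a large mean squared error but the same mutual information as the identical copy, because mutual information depends only on how predictable one image is from the other, not on the intensity values themselves.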
Availability: A Java webstart application to register and compare patterns, as well as all source code, are available from:
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3140183  PMID: 19942587
3.  Sequence Finishing and Mapping of Drosophila melanogaster Heterochromatin 
Science (New York, N.Y.)  2007;316(5831):1625-1628.
Genome sequences for most metazoans and plants are incomplete because of the presence of repeated DNA in the heterochromatin. The heterochromatic regions of Drosophila melanogaster contain 20 million bases (Mb) of sequence amenable to mapping, sequence assembly, and finishing. We describe the generation of 15 Mb of finished or improved heterochromatic sequence with the use of available clone resources and assembly methods. We also constructed a bacterial artificial chromosome–based physical map that spans 13 Mb of the pericentromeric heterochromatin and a cytogenetic map that positions 11 Mb in specific chromosomal locations. We have approached a complete assembly and mapping of the nonsatellite component of Drosophila heterochromatin. The strategy we describe is also applicable to generating substantially more information about heterochromatin in other species, including humans.
PMCID: PMC2825053  PMID: 17569867
4.  Systematic image-driven analysis of the spatial Drosophila embryonic expression landscape 
We created an innovative virtual representation for our large-scale Drosophila in situ expression dataset. We aligned an elliptically shaped mesh composed of small triangular regions to the outline of each embryo. Each triangle defines a unique location in the embryo, and comparing corresponding triangles allows easy identification of similar expression patterns. The virtual representation was used to organize the expression landscape at stages 4-6. We identified regions with similar expression in the embryo and clustered genes with similar expression patterns. We created algorithms to mine the dataset for adjacent non-overlapping patterns and anti-correlated patterns. We were able to mine the dataset to identify co-expressed and putative interacting genes. Using co-expression, we were able to assign putative functions to unknown genes.
Analyzing both temporal and spatial gene expression is essential for understanding development and regulatory networks of multicellular organisms. Interacting genes are commonly expressed in overlapping or adjacent domains. Thus, gene expression patterns can be used to assign putative gene functions and mined to infer candidates for networks.
We have generated a systematic two-dimensional mRNA expression atlas profiling embryonic development of Drosophila melanogaster (Tomancak et al, 2002, 2007). To date, we have collected over 70 000 images for over 6000 genes. To explore spatial relationships between gene expression patterns, we used a novel computational image-processing approach that converts expression patterns from the images into virtual representations (Figure 1). Using a custom-designed automated pipeline, for each image, we segmented the embryo and aligned its outline to an elliptically shaped mesh composed of 311 small triangular regions, each defining a unique location within the embryo. By comparing corresponding triangles, we produced a distance score to identify similar patterns. We generated these triangulated images (TIs) for our entire data set at all developmental stages and demonstrated that this representation can be used as an objective, computationally defined description of expression in in situ hybridization images from various sources, including images from the literature.
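The triangle-by-triangle comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: each TI is treated as a list of 311 per-triangle intensities in a fixed order, the distance is taken to be the mean absolute difference (the abstract does not specify the actual distance score), and the gene names and patterns are hypothetical.

```python
N_TRIANGLES = 311  # mesh size given in the abstract


def ti_distance(ti_a, ti_b):
    """Mean absolute difference between corresponding triangle intensities.

    Assumed distance for illustration; the published score may differ.
    """
    if len(ti_a) != len(ti_b):
        raise ValueError("TIs must use the same mesh")
    return sum(abs(x - y) for x, y in zip(ti_a, ti_b)) / len(ti_a)


def most_similar(query, atlas):
    """Return (name, distance) pairs sorted by increasing distance to query."""
    scored = ((name, ti_distance(query, ti)) for name, ti in atlas.items())
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical patterns: triangles indexed 0..310 along the embryo.
query = [1.0 if i < 100 else 0.0 for i in range(N_TRIANGLES)]
atlas = {
    "gene_x": [1.0 if i < 110 else 0.0 for i in range(N_TRIANGLES)],   # slightly wider domain
    "gene_y": [1.0 if i >= 211 else 0.0 for i in range(N_TRIANGLES)],  # opposite end
}

ranked = most_similar(query, atlas)
for name, dist in ranked:
    print(name, round(dist, 4))
```

Because every image is mapped onto the same mesh, the comparison reduces to an element-wise operation on fixed-length vectors, which is what makes searching a large atlas straightforward.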
We used the TIs to conduct a comprehensive analysis of the expression landscape. To this end, we created a novel approach to temporally sort and compact TIs to a non-redundant data set suitable for further computational processing. Although generally applicable for all developmental stages, for this study, we focused on developmental stages 4–6. For this stage range, we reduced the initial set of about 5800 TIs to 553 TIs containing 364 genes. Using this filtered data set, to discover how expression subdivides the embryo into regions, we clustered areas with similar expression and demonstrated that expression patterns divide the early embryo into distinct spatial regions resembling a fate map (Figure 3). To discover the range of unique expression patterns, we used affinity propagation clustering (Frey and Dueck, 2007) to group TIs with similar patterns and identified 39 clusters each representing a distinct pattern class. We integrated the remaining genes into the 39 clusters and studied the distribution of expression patterns and the relationships between the clusters.
The clustered expression patterns were used to identify putative positive and negative regulatory interactions. The similar TIs in each cluster grouped not only already known genes with related functions, but also previously undescribed genes. A comparative analysis identified subtle differences between the genes within each expression cluster. To investigate these differences, we developed a novel Markov Random Field (MRF) segmentation algorithm to extract patterns. We then extended the MRF algorithm to detect shared expression boundaries, generate similarity measurements, and discriminate even faint or uncertain patterns between two TIs. This enabled us to identify more subtle partial expression pattern overlaps and adjacent non-overlapping patterns. For example, by conducting this analysis on the cluster containing the gene snail, we identified the previously known huckebein, which restricts snail expression (Reuter and Leptin, 1994), and zfh1, which interacts with tinman (Broihier et al, 1998; Su et al, 1999).
By studying the functions of known genes, we assigned putative developmental roles to each of the 39 clusters. Of the 1800 genes investigated, only half of them had previously assigned functions.
Representing expression patterns with geometric meshes facilitates the analysis of a complex process involving thousands of genes. This approach is complementary to the cellular resolution 3D atlas for the Drosophila embryo (Fowlkes et al, 2008). Our method can be used as a rapid, fully automated, high-throughput approach to obtain a map of co-expression, which will serve to select specific genes for detailed multiplex in situ hybridization and confocal analysis for a fine-grain atlas. Our data are similar to the data in the literature, and research groups studying reporter constructs, mutant animals, or orthologs can easily produce in situ hybridizations. TIs can be readily created and provide representations that are comparable both to each other and to our data set. We have demonstrated that our approach can be used for predicting relationships in regulatory and developmental pathways.
Discovery of temporal and spatial patterns of gene expression is essential for understanding the regulatory networks and development in multicellular organisms. We analyzed the images from our large-scale spatial expression data set of early Drosophila embryonic development and present a comprehensive computational image analysis of the expression landscape. For this study, we created an innovative virtual representation of embryonic expression patterns using an elliptically shaped mesh grid that allows us to make quantitative comparisons of gene expression using a common frame of reference. Demonstrating the power of our approach, we used gene co-expression to identify distinct expression domains in the early embryo; the result is surprisingly similar to the fate map determined using laser ablation. We also used a clustering strategy to find genes with similar patterns and developed new analysis tools to detect variation within consensus patterns, adjacent non-overlapping patterns, and anti-correlated patterns. Of the 1800 genes investigated, only half had previously assigned functions. The known genes suggest developmental roles for the clusters, and identification of related patterns predicts requirements for co-occurring biological functions.
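One simple way to flag co-expressed and anti-correlated patterns on a common frame of reference is a correlation coefficient over corresponding mesh regions. The sketch below uses Pearson correlation with an arbitrary threshold of 0.8; both choices, and the six-region toy patterns, are assumptions for illustration and are not the analysis tools described in the abstract.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long expression vectors."""
    if len(a) != len(b):
        raise ValueError("patterns must use the same mesh")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return cov / math.sqrt(var_a * var_b)


def classify_pair(a, b, threshold=0.8):
    """Label a pair of patterns using an assumed correlation threshold."""
    r = pearson(a, b)
    if r >= threshold:
        return "co-expressed"
    if r <= -threshold:
        return "anti-correlated"
    return "unrelated"


# Hypothetical patterns over six mesh regions.
pattern_1 = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
pattern_2 = [1.0, 1.0, 0.8, 0.0, 0.0, 0.1]  # nearly the same domain
pattern_3 = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # complementary domain
pattern_4 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]  # alternating stripes

print(classify_pair(pattern_1, pattern_2))
print(classify_pair(pattern_1, pattern_3))
print(classify_pair(pattern_1, pattern_4))
```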
PMCID: PMC2824522  PMID: 20087342
biological function; embryo; gene expression; in situ hybridization; Markov Random Field
5.  The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective 
Genome Biology  2002;3(12):research0084.1-84.2.
Using Release 3 of the euchromatic genomic sequence of Drosophila melanogaster, 85 known and eight novel families of transposable element have been identified, varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence.
Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species.
We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes.
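The claim that transposable elements lie preferentially outside genes can be checked with simple arithmetic on the counts above. The sketch below is a worked example under one loud assumption: the total euchromatic length is taken to be 116.9 Mb, the Release 3 figure reported in the finishing paper listed in this result set, since this abstract gives only the 61.4 Mb transcribed portion.

```python
TOTAL_TE = 1572            # full and partial elements (from the abstract)
TE_IN_TRANSCRIBED = 436    # elements within annotated transcribed sequence
TRANSCRIBED_MB = 61.4      # annotated transcribed sequence, in Mb
EUCHROMATIN_MB = 116.9     # ASSUMPTION: Release 3 length from the finishing paper


def density_per_mb(count, length_mb):
    """Number of elements per megabase."""
    return count / length_mb


inside = density_per_mb(TE_IN_TRANSCRIBED, TRANSCRIBED_MB)
outside = density_per_mb(TOTAL_TE - TE_IN_TRANSCRIBED,
                         EUCHROMATIN_MB - TRANSCRIBED_MB)

# Count expected inside transcribed sequence if elements were spread uniformly.
expected_inside = TOTAL_TE * TRANSCRIBED_MB / EUCHROMATIN_MB

print(f"inside transcribed sequence:  {inside:.2f} elements per Mb")
print(f"outside transcribed sequence: {outside:.2f} elements per Mb")
print(f"expected inside if uniform:   {expected_inside:.0f} (observed {TE_IN_TRANSCRIBED})")
```

Under that assumption the density is roughly 7 elements per Mb inside transcribed sequence and roughly 20 per Mb outside it, consistent with the stated preference.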
This analysis represents an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses.
PMCID: PMC151186  PMID: 12537573
6.  Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence 
Genome Biology  2002;3(12):research0079.1-79.14.
The Drosophila melanogaster genome was the first metazoan genome to be sequenced by whole-genome shotgun. Now, the sequence has been finished in a process designed to close gaps, improve sequence quality and validate the assembly.
The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.
Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
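To put the reported Release 2 error rate on a familiar scale, it can be expressed as a Phred-style quality value, Q = -10 log10(p). This conversion is a standard convention added here for illustration; the abstract itself reports only the rate of one error in 20,000 bp for autosomal regions of unique sequence.

```python
import math


def phred_quality(error_rate):
    """Convert a per-base error probability to a Phred-style quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


def error_rate_from_phred(q):
    """Inverse conversion: quality value back to an error probability."""
    return 10 ** (-q / 10)


release2_error_rate = 1 / 20000  # from the abstract
release2_q = phred_quality(release2_error_rate)

print(f"Release 2 unique autosomal sequence: about Q{release2_q:.0f}")
```

One error in 20,000 bp corresponds to a quality value of about 43, somewhat better than the one-in-10,000 rate implied by Q40.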
The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.
PMCID: PMC151181  PMID: 12537568
