Search tips
Search criteria

Results 1-4 (4)

Clipboard (0)

Select a Filter Below

more »
Year of Publication
Document Types
1.  Quantification and Analysis of Large Multimodal Clinical Image Studies: Application to Stroke 
We present an analysis framework for large studies of multimodal clinical quality brain image collections. Processing and analysis of such datasets is challenging due to low resolution, poor contrast, mis-aligned images, and restricted field of view. We adapt existing registration and segmentation methods and build a computational pipeline for spatial normalization and feature extraction. The resulting aligned dataset enables clinically meaningful analysis of spatial distributions of relevant anatomical features and of their evolution with age and disease progression. We demonstrate the approach on a neuroimaging study of stroke with more than 800 patients. We show that by combining data from several modalities, we can automatically segment important biomarkers such as white matter hyperintensity and characterize pathology evolution in this heterogeneous cohort. Specifically, we examine two sub-populations with different dynamics of white matter hyperintensity changes as a function of patients’ age. Pipeline and analysis code is available at
PMCID: PMC4306599  PMID: 25632408
2.  Joint Modeling of Imaging and Genetics 
We propose a unified Bayesian framework for detecting genetic variants associated with a disease while exploiting image-based features as an intermediate phenotype. Traditionally, imaging genetics methods comprise two separate steps. First, image features are selected based on their relevance to the disease phenotype. Second, a set of genetic variants are identified to explain the selected features. In contrast, our method performs these tasks simultaneously to ultimately assign probabilistic measures of relevance to both genetic and imaging markers. We derive an efficient approximate inference algorithm that handles high dimensionality of imaging genetic data. We evaluate the algorithm on synthetic data and show that it outperforms traditional models. We also illustrate the application of the method on ADNI data.
PMCID: PMC3979537  PMID: 24684016
Imaging Genetics; Bayesian Models; Variational Inference; Probabilistic Graphical Model
3.  VARiD: A variation detection framework for color-space and letter-space platforms 
Bioinformatics  2010;26(12):i343-i349.
Motivation: High-throughput sequencing (HTS) technologies are transforming the study of genomic variation. The various HTS technologies have different sequencing biases and error rates, and while most HTS technologies sequence the residues of the genome directly, generating base calls for each position, the Applied Biosystem's SOLiD platform generates dibase-coded (color space) sequences. While combining data from the various platforms should increase the accuracy of variation detection, to date there are only a few tools that can identify variants from color space data, and none that can analyze color space and regular (letter space) data together.
Results: We present VARiD—a probabilistic method for variation detection from both letter- and color-space reads simultaneously. VARiD is based on a hidden Markov model and uses the forward-backward algorithm to accurately identify heterozygous, homozygous and tri-allelic SNPs, as well as micro-indels. Our analysis shows that VARiD performs better than the AB SOLiD toolset at detecting variants from color-space data alone, and improves the calls dramatically when letter- and color-space reads are combined.
Availability: The toolset is freely available at
PMCID: PMC2881369  PMID: 20529926
4.  SHRiMP: Accurate Mapping of Short Color-space Reads 
PLoS Computational Biology  2009;5(5):e1000386.
The development of Next Generation Sequencing technologies, capable of sequencing hundreds of millions of short reads (25–70 bp each) in a single run, is opening the door to population genomic studies of non-model species. In this paper we present SHRiMP - the SHort Read Mapping Package: a set of algorithms and methods to map short reads to a genome, even in the presence of a large amount of polymorphism. Our method is based upon a fast read mapping technique, separate thorough alignment methods for regular letter-space as well as AB SOLiD (color-space) reads, and a statistical model for false positive hits. We use SHRiMP to map reads from a newly sequenced Ciona savignyi individual to the reference genome. We demonstrate that SHRiMP can accurately map reads to this highly polymorphic genome, while confirming high heterozygosity of C. savignyi in this second individual. SHRiMP is freely available at
Author Summary
Next Generation Sequencing (NGS) technologies are revolutionizing the way biologists acquire and analyze genomic data. NGS machines, such as Illumina/Solexa and AB SOLiD, are able to sequence genomes more cheaply by 200-fold than previous methods. One of the main application areas of NGS technologies is the discovery of genomic variation within a given species. The first step in discovering this variation is the mapping of reads sequenced from a donor individual to a known (“reference”) genome. Differences between the reference and the reads are indicative either of polymorphisms, or of sequencing errors. Since the introduction of NGS technologies, many methods have been devised for mapping reads to reference genomes. However, these algorithms often sacrifice sensitivity for fast running time. While they are successful at mapping reads from organisms that exhibit low polymorphism rates, they do not perform well at mapping reads from highly polymorphic organisms. We present a novel read mapping method, SHRiMP, that can handle much greater amounts of polymorphism. Using Ciona savignyi as our target organism, we demonstrate that our method discovers significantly more variation than other methods. Additionally, we develop color-space extensions to classical alignment algorithms, allowing us to map color-space, or “dibase”, reads generated by AB SOLiD sequencers.
PMCID: PMC2678294  PMID: 19461883

Results 1-4 (4)