Results 1-6 (6)

1.  Open-access synthetic spike-in mRNA-seq data for cancer gene fusions 
BMC Genomics  2014;15(1):824.
Oncogenic fusion genes underlie the mechanism of several common cancers. Next-generation sequencing based RNA-seq analyses have revealed an increasing number of recurrent fusions in a variety of cancers. However, the absence of publicly available, gene-fusion-focused RNA-seq data impedes comparative assessment and collaborative development of novel gene fusion detection algorithms. We have generated nine synthetic poly-adenylated RNA transcripts that correspond to previously reported oncogenic gene fusions. These synthetic RNAs were spiked at known molarity over a wide range into total RNA prior to construction of next-generation sequencing mRNA libraries to generate RNA-seq data.
Leveraging a priori knowledge about replicates and molarity of each synthetic fusion transcript, we demonstrate utility of this dataset to compare multiple gene fusion algorithms’ detection ability. In general, more fusions are detected at higher molarity, indicating that our constructs performed as expected. However, systematic detection differences are observed based on molarity or algorithm-specific characteristics. Fusion-sequence specific detection differences indicate that for applications where specific sequences are being investigated, additional constructs may be added to provide quantitative data that is specific for the sequence of interest.
To our knowledge, this is the first publicly available synthetic RNA-seq dataset that specifically leverages known cancer gene fusions. The proposed method of designing multiple gene-fusion constructs over a wide range of molarity allows granular performance analyses of multiple fusion-detection algorithms. The community can leverage and augment this publicly available data to further collaborative development of analytical tools and performance assessment frameworks for gene fusions from next-generation sequencing data.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-824) contains supplementary material, which is available to authorized users.
PMCID: PMC4190330  PMID: 25266161
RNA-seq; Gene fusions; Cancer genomics
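The comparison described in the abstract above (detection ability per algorithm as a function of spike-in molarity) can be sketched as a simple tabulation. This is a minimal illustration, not the authors' pipeline: the record layout `(algorithm, molarity, detected)`, the algorithm names, and the molarity values are all hypothetical.

```python
from collections import defaultdict


def detection_rate(calls):
    """Fraction of spiked-in fusions detected per (algorithm, molarity).

    `calls` is an iterable of (algorithm, molarity, detected) tuples;
    this layout is an assumption made for the example.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [detected, total]
    for algorithm, molarity, detected in calls:
        entry = counts[(algorithm, molarity)]
        entry[0] += int(bool(detected))
        entry[1] += 1
    return {key: hit / total for key, (hit, total) in counts.items()}


# Hypothetical calls for two made-up algorithms at two made-up molarities.
example_calls = [
    ("algo_A", 1e-10, False),
    ("algo_A", 1e-10, True),
    ("algo_A", 1e-8, True),
    ("algo_A", 1e-8, True),
    ("algo_B", 1e-10, False),
    ("algo_B", 1e-10, False),
    ("algo_B", 1e-8, True),
    ("algo_B", 1e-8, False),
]
rates = detection_rate(example_calls)
```

Keying the table on both algorithm and molarity keeps algorithm-specific and molarity-specific differences separable, which is the kind of granular comparison the abstract describes.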
2.  Survey of Culture, GoldenGate Assay, Universal Biosensor Assay, and 16S rRNA Gene Sequencing as Alternative Methods of Bacterial Pathogen Detection 
Journal of Clinical Microbiology  2013;51(10):3263-3269.
Cultivation-based assays combined with PCR or enzyme-linked immunosorbent assay (ELISA)-based methods for finding virulence factors are standard methods for detecting bacterial pathogens in stools; however, with emerging molecular technologies, new methods have become available. The aim of this study was to compare four distinct detection technologies for the identification of pathogens in stools from children under 5 years of age in The Gambia, Mali, Kenya, and Bangladesh. The children were identified, using currently accepted clinical protocols, as either controls or cases with moderate to severe diarrhea. A total of 3,610 stool samples were tested by established clinical culture techniques: 3,179 DNA samples by the Universal Biosensor assay (Ibis Biosciences, Inc.), 1,466 DNA samples by the GoldenGate assay (Illumina), and 1,006 DNA samples by sequencing of 16S rRNA genes. Each method detected different proportions of samples testing positive for each of seven enteric pathogens, enteroaggregative Escherichia coli (EAEC), enterotoxigenic E. coli (ETEC), enteropathogenic E. coli (EPEC), Shigella spp., Campylobacter jejuni, Salmonella enterica, and Aeromonas spp. The comparisons among detection methods included the frequency of positive stool samples and kappa values for making pairwise comparisons. Overall, the standard culture methods detected Shigella spp., EPEC, ETEC, and EAEC in smaller proportions of the samples than either of the methods based on detection of the virulence genes from DNA in whole stools. The GoldenGate method revealed the greatest agreement with the other methods. The agreement among methods was higher in cases than in controls. The new molecular technologies have a high potential for highly sensitive identification of bacterial diarrheal pathogens.
PMCID: PMC3811648  PMID: 23884998
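The abstract above reports kappa values for pairwise comparisons between detection methods. As a rough sketch of what such a statistic measures, the following computes Cohen's kappa for two methods' positive/negative calls on the same samples. The abstract does not state which kappa variant or software was used, so the choice of Cohen's kappa and the example data are assumptions.

```python
def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two binary (0/1) call vectors on the same samples."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call vectors must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of samples where both methods agree.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement from each method's marginal positive rate.
    p_a = sum(calls_a) / n
    p_b = sum(calls_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both methods give the same constant call; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical calls for one pathogen across eight stool samples.
culture_calls = [1, 1, 0, 0, 1, 0, 0, 0]
molecular_calls = [1, 1, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(culture_calls, molecular_calls)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because a molecular method that calls more samples positive can agree with culture on many negatives without being concordant on positives.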
3.  Expanding the Helicobacter pylori Genetic Toolbox: Modification of an Endogenous Plasmid for Use as a Transcriptional Reporter and Complementation Vector
Applied and Environmental Microbiology  2007;73(23):7506-7514.
Helicobacter pylori is an important human pathogen. However, the study of this organism is often limited by a relative shortage of genetic tools. In an effort to expand the methods available for genetic study, an endogenous H. pylori plasmid was modified for use as a transcriptional reporter and as a complementation vector. This was accomplished by addition of an Escherichia coli origin of replication, a kanamycin resistance cassette, a promoterless gfpmut3 gene, and a functional multiple cloning site to form pTM117. The promoters of amiE and pfr, two well-characterized Fur-regulated promoters, were fused to the promoterless gfpmut3, and green fluorescent protein (GFP) expression of the fusions in wild-type and Δfur strains was analyzed by flow cytometry under iron-replete and iron-depleted conditions. GFP expression was altered as expected based on current knowledge of Fur regulation of these promoters. RNase protection assays were used to determine the ability of this plasmid to serve as a complementation vector by analyzing amiE, pfr, and fur expression in wild-type and Δfur strains carrying a wild-type copy of fur on the plasmid. Proper regulation of these genes was restored in the Δfur background under high- and low-iron conditions, signifying complementation of both iron-bound and apo Fur regulation. These studies show the potential of pTM117 as a molecular tool for genetic analysis of H. pylori.
PMCID: PMC2168067  PMID: 17921278
4.  cis sequence effects on gene expression 
BMC Genomics  2007;8:296.
Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature.
We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p < 0.05) association with gene expression. Using the literature as a "gold standard" to compare 14 genes with data from both this study and the literature, we observed 80% and 85% concordance for genes exhibiting and not exhibiting significant cis sequence effects in our study, respectively.
Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.
PMCID: PMC2077339  PMID: 17727713
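The additive models mentioned in the abstract above test association between SNP genotype and log-transformed expression. A minimal sketch of one such single-SNP model follows, assuming genotypes coded as minor-allele counts (0, 1, 2) and a base-2 logarithm; neither the coding nor the log base is given in the abstract, and this is not the authors' exact procedure.

```python
import math


def additive_fit(genotypes, expression):
    """Least-squares fit of log2(expression) on allele count (0/1/2).

    Returns (slope, intercept, r_squared). r_squared is the share of
    log-expression variance explained by this one SNP.
    """
    if len(genotypes) != len(expression) or len(genotypes) < 2:
        raise ValueError("need at least two paired observations")
    y = [math.log2(value) for value in expression]
    n = len(y)
    mean_x = sum(genotypes) / n
    mean_y = sum(y) / n
    sxx = sum((g - mean_x) ** 2 for g in genotypes)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((g - mean_x) * (v - mean_y) for g, v in zip(genotypes, y))
    if sxx == 0:
        raise ValueError("genotype has no variance; the SNP is monomorphic here")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical data: expression doubles with each copy of the allele.
slope, intercept, r_squared = additive_fit([0, 1, 2, 0, 1, 2], [1, 2, 4, 1, 2, 4])
```

The `r_squared` value corresponds to the quantity the abstract summarizes as the proportion of gene expression variation accounted for by cis sequence variation, here for a single SNP only.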
5.  Genome wide profiling of human embryonic stem cells (hESCs), their derivatives and embryonal carcinoma cells to develop base profiles of U.S. Federal government approved hESC lines 
To compare the gene expression profiles of human embryonic stem cell (hESC) lines and their differentiated progeny, and to monitor feeder contamination, we examined gene expression in seven hESC lines and human fibroblast feeder cells using Illumina® bead arrays that contain 24,131 transcript probes.
A total of 48 different samples (including duplicates) grown in multiple laboratories under different conditions were analyzed and pairwise comparisons were performed in all groups. Hierarchical clustering showed that blinded duplicates were correctly identified as the closest related samples. hESC lines clustered together irrespective of the laboratory in which they were maintained. hESCs could be readily distinguished from embryoid bodies (EB) differentiated from them and the karyotypically abnormal hESC line BG01V. The embryonal carcinoma (EC) line NTera2 is a useful model for evaluating characteristics of hESCs. Expression of subsets of individual genes was validated by comparing with published databases, MPSS (Massively Parallel Signature Sequencing) libraries, and parallel analysis by microarray and RT-PCR.
We show that Illumina's bead array platform is a reliable, reproducible and robust method for developing base global profiles of cells and identifying similarities and differences in a large number of samples.
PMCID: PMC1523200  PMID: 16672070
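The abstract above notes that blinded duplicates were identified as the closest related samples. The check below is a simplified stand-in for that idea: it finds each sample's nearest neighbor by correlation distance rather than building a full hierarchical clustering. The distance metric, sample names, and expression values are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def nearest_sample(profiles):
    """Map each sample name to the other sample with the smallest
    correlation distance (1 - Pearson r)."""
    nearest = {}
    for name, values in profiles.items():
        distances = [
            (1 - pearson(values, other_values), other)
            for other, other_values in profiles.items()
            if other != name
        ]
        nearest[name] = min(distances)[1]
    return nearest


# Hypothetical four-probe profiles: two replicates and a feeder sample.
profiles = {
    "lineA_rep1": [1.0, 2.0, 3.0, 4.0],
    "lineA_rep2": [1.1, 2.0, 3.2, 3.9],
    "feeder": [4.0, 3.0, 2.0, 1.0],
}
neighbors = nearest_sample(profiles)
```

If duplicates are reproducible, each replicate should name its partner as nearest neighbor, which is the first merge an agglomerative clustering would make.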
6.  The Locus of Enterocyte Effacement (LEE)-Encoded Regulator Controls Expression of Both LEE- and Non-LEE-Encoded Virulence Factors in Enteropathogenic and Enterohemorrhagic Escherichia coli 
Infection and Immunity  2000;68(11):6115-6126.
Regulation of virulence gene expression in enteropathogenic Escherichia coli (EPEC) and enterohemorrhagic E. coli (EHEC) is incompletely understood. In EPEC, the plasmid-encoded regulator Per is required for maximal expression of proteins encoded on the locus of enterocyte effacement (LEE), and a LEE-encoded regulator (Ler) is part of the Per-mediated regulatory cascade upregulating the LEE2, LEE3, and LEE4 promoters. We now report that Ler is essential for the expression of multiple LEE-located genes in both EPEC and EHEC, including those encoding the type III secretion pathway, the secreted Esp proteins, Tir, and intimin. Ler is therefore central to the process of attaching and effacing (AE) lesion formation. Ler also regulates the expression of LEE-located genes not required for AE-lesion formation, including rorf2, orf10, rorf10, orf19, and espF, indicating that Ler regulates additional virulence properties. In addition, Ler regulates the expression of proteins encoded outside the LEE that are not essential for AE lesion formation, including TagA in EHEC and EspC in EPEC. Δler mutants of both EPEC and EHEC show altered adherence to epithelial cells and express novel fimbriae. Ler is therefore a global regulator of virulence gene expression in EPEC and EHEC.
PMCID: PMC97688  PMID: 11035714