
Results 1-10 (10)

author:("golub, Jeremy")
1.  MicroRNA polymorphisms and risk of colorectal cancer 
MicroRNAs (miRNAs) act as post-transcriptional regulators of gene expression. Genetic variation in miRNA-encoding sequences or their corresponding binding sites may affect the fidelity of the miRNA-messenger RNA interaction and subsequently alter risk of cancer development.
This study expanded the search for miRNA-related polymorphisms contributing to the etiology of colorectal cancer (CRC) across the genome using a novel platform, the Axiom® miRNA Target Site Genotyping Array (237,858 markers). After quality control, the study included 596 cases and 429 controls from the Molecular Epidemiology of Colorectal Cancer study, a population-based case-control study of CRC in northern Israel. The association between each marker and CRC status was examined assuming a log-additive genetic model using logistic regression adjusted for sex, age, and two principal components.
Twenty-three markers had p-values less than 5.0E-04, and the most statistically significant association involved rs2985 (chr6:34845648; intronic of UHRF1BP1; OR=0.66; p-value=3.7E-05). Further, this study replicated a previously published locus, rs1051690 in the 3’-untranslated region of the insulin receptor gene INSR (OR = 1.38; p = 0.03), with strong evidence of differences in INSR gene expression by genotype.
This study is the first to examine associations between genetic variation in miRNA target sites and CRC using a genome-wide approach. Functional studies to identify allele-specific effects on miRNA binding are needed to confirm the regulatory capacity of genetic variation to influence risk of CRC.
This study demonstrates the potential for a miRNA-targeted genome-wide association study to identify candidate susceptibility loci and prioritize them for functional characterization.
PMCID: PMC4867219  PMID: 25342389
miRNA; genome-wide association study; CRC; GWAS; susceptibility
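The log-additive genetic model used in entry 1 can be illustrated with a minimal sketch. It assumes each genotype is coded as the number of copies of an effect allele (0, 1, or 2), so the odds ratio multiplies once per copy. The function names and the example alleles are hypothetical, and the abstract does not say which rs2985 allele the reported OR refers to.

```python
import math


def additive_code(genotype: str, effect_allele: str) -> int:
    """Count copies of the effect allele in a two-letter genotype (0, 1, or 2)."""
    return sum(1 for allele in genotype if allele == effect_allele)


def odds_ratio_for_copies(per_allele_or: float, copies: int) -> float:
    """Odds ratio relative to zero copies under a log-additive model.

    log(odds) is linear in the allele count, so the OR is per_allele_or ** copies.
    """
    return math.exp(copies * math.log(per_allele_or))


# Hypothetical genotypes; "A" is an assumed effect allele, not taken from the study.
codes = [additive_code(g, "A") for g in ("GG", "AG", "AA")]

# Using the per-allele OR of 0.66 reported for rs2985 in the abstract.
or_two_copies = odds_ratio_for_copies(0.66, 2)
```

Under this coding, a carrier of two copies has the squared per-allele odds ratio. The study's actual regression also adjusted for sex, age, and two principal components, which this sketch leaves out.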
2.  Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm 
Genomics  2011;98(6):422-430.
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies.
PMCID: PMC3502750  PMID: 21903159
Microarray; Genome-wide association study; Coverage; Imputation; Single nucleotide polymorphism; Throughput
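The hybrid selection idea described in entry 2 can be sketched as a set-cover loop. This is an illustrative sketch under stated assumptions, not the authors' algorithm: `tags` maps each candidate SNP to the set of target SNPs it tags pairwise, and `covered_by_imputation` is a hypothetical callback standing in for whatever imputation-based coverage check is used.

```python
from typing import Callable, Dict, List, Set, Tuple


def greedy_pairwise_select(
    tags: Dict[str, Set[str]], targets: Set[str], budget: int
) -> Tuple[List[str], Set[str]]:
    """Repeatedly pick the SNP that tags the most still-uncovered targets."""
    remaining = set(targets)
    chosen: List[str] = []
    while remaining and tags and len(chosen) < budget:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(tags), key=lambda snp: len(tags[snp] & remaining))
        gain = tags[best] & remaining
        if not gain:
            break  # no candidate covers anything new
        chosen.append(best)
        remaining -= gain
    return chosen, remaining


def hybrid_select(
    tags: Dict[str, Set[str]],
    targets: Set[str],
    round_size: int,
    covered_by_imputation: Callable[[List[str], Set[str]], Set[str]],
) -> Tuple[List[str], Set[str]]:
    """Alternate greedy rounds with removal of targets already covered by imputation."""
    remaining = set(targets)
    chosen: List[str] = []
    while remaining:
        candidates = {s: t for s, t in tags.items() if s not in chosen}
        picked, remaining = greedy_pairwise_select(candidates, remaining, round_size)
        if not picked:
            break
        chosen.extend(picked)
        # Drop targets that the chosen SNPs already cover through imputation.
        remaining -= covered_by_imputation(chosen, remaining)
    return chosen, remaining


# Toy data: SNP names and the imputation rule below are made up for illustration.
toy_tags = {"s1": {"a", "b", "c"}, "s2": {"c", "d"}, "s3": {"d"}}
toy_targets = {"a", "b", "c", "d"}


def toy_imputation(chosen: List[str], remaining: Set[str]) -> Set[str]:
    # Pretend that target "d" becomes imputable once "s1" is on the array.
    return {"d"} & remaining if "s1" in chosen else set()


pairwise_only = greedy_pairwise_select(toy_tags, toy_targets, budget=10)
hybrid = hybrid_select(toy_tags, toy_targets, round_size=1,
                       covered_by_imputation=toy_imputation)
```

In the toy data, pairwise tagging alone needs two SNPs to cover every target, while the hybrid loop stops after one because the remaining target is removed as imputable. That is the kind of saving in array capacity the abstract attributes to the hybrid method.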
3.  Next generation genome-wide association tool: Design and coverage of a high-throughput European-optimized SNP array 
Genomics  2011;98(2):79-89.
The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new highly-efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large scale genome-wide association studies.
PMCID: PMC3146553  PMID: 21565264
Microarray; Genome-wide association study; Coverage; Throughput; Single nucleotide polymorphism
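The quality figures quoted in entry 3 rest on per-sample SNP call rates. A minimal sketch of that calculation follows, assuming a call rate is the fraction of SNPs with a non-missing genotype call and that missing calls are represented as `None`. The sample names and genotypes are invented for illustration.

```python
from typing import Dict, List, Optional


def call_rate(calls: List[Optional[str]]) -> float:
    """Fraction of SNPs with a non-missing genotype call for one sample."""
    return sum(call is not None for call in calls) / len(calls)


def fraction_exceeding(samples: Dict[str, List[Optional[str]]],
                       threshold: float = 0.98) -> float:
    """Fraction of samples whose call rate is strictly above the threshold."""
    rates = [call_rate(calls) for calls in samples.values()]
    return sum(rate > threshold for rate in rates) / len(rates)


# Toy data: two samples, four SNPs each; None marks a no-call.
toy_samples = {
    "sample_1": ["AA", "AB", None, "BB"],
    "sample_2": ["AA", "AA", "AB", "BB"],
}
toy_fraction = fraction_exceeding(toy_samples)
```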
4.  Genome-Wide Maps of Circulating miRNA Biomarkers for Ulcerative Colitis 
PLoS ONE  2012;7(2):e31241.
Inflammatory Bowel Disease, comprising Crohn's Disease and Ulcerative Colitis (UC), is a complex, multi-factorial inflammatory disorder of the gastrointestinal tract. In this study we have explored the utility of naturally occurring circulating miRNAs as potential blood-based biomarkers for non-invasive prediction of UC incidence. Whole genome maps of circulating miRNAs in micro-vesicles, Peripheral Blood Mononuclear Cells and platelets have been constructed from a cohort of 20 UC patients and 20 normal individuals. Through Significance Analysis of Microarrays, a signature of 31 differentially expressed platelet-derived miRNAs has been identified and biomarker performance estimated through a non-probabilistic binary linear classification using Support Vector Machines. Through this approach, classifier measurements reveal a predictive score of 92.8% accuracy, 96.2% specificity and 89.5% sensitivity in distinguishing UC patients from normal individuals. Additionally, the platelet-derived biomarker signature can be validated at 88% accuracy through qPCR assays, and a majority of the miRNAs in this panel can be demonstrated to sub-stratify into 4 highly correlated intensity-based clusters. Analysis of predicted targets of these biomarkers reveals an enrichment of pathways associated with cytoskeleton assembly, transport, membrane permeability and regulation of transcription factors engaged in a variety of regulatory cascades that are consistent with a cell-mediated immune response model of intestinal inflammation. Interestingly, comparison of the miRNA biomarker panel and genetic loci implicated in IBD through genome-wide association studies identifies a physical linkage between hsa-miR-941 and a UC susceptibility locus located on Chr 20.
Taken together, analysis of these expression maps outlines a promising catalog of novel platelet-derived miRNA biomarkers of clinical utility and provides insight into the potential biological function of these candidates in disease pathogenesis.
PMCID: PMC3281076  PMID: 22359580
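The accuracy, specificity, and sensitivity figures in entry 4 are standard confusion-matrix summaries. The sketch below shows how they are derived from true and predicted labels, assuming 1 marks a UC patient and 0 a normal individual. The label vectors are toy data, not the study's results.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of cases correctly identified
        "specificity": tn / (tn + fp),  # share of controls correctly identified
    }


# Toy labels: 4 cases and 4 controls, with one error in each group.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)
```

In practice these would be computed on held-out or cross-validated predictions from the classifier. The abstract does not describe the validation scheme, so none is assumed here.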
5.  Impact of Cellular miRNAs on Circulating miRNA Biomarker Signatures 
PLoS ONE  2011;6(6):e20769.
Effective diagnosis and surveillance of complex multi-factorial disorders such as cancer can be improved by screening of easily accessible biomarkers. Highly stable cell-free Circulating Nucleic Acids (CNA) present as both RNA and DNA species have been discovered in the blood and plasma of humans. Correlations between tumor-associated genomic/epigenetic/transcriptional changes and alterations in CNA levels are strong predictors of the utility of this biomarker class as promising clinical indicators. Towards this goal, microRNAs (miRNAs), a class of naturally occurring small non-coding RNAs of 19–25 nt in length, have emerged as an important set of markers whose specific expression profiles can be associated with cancer development. In this study we investigate some of the pre-analytic considerations for isolating plasma fractions for the study of miRNA biomarkers. We find that measurements of circulating miRNA levels are frequently confounded by varying levels of cellular miRNAs of different hematopoietic origins. In order to assess the relative proportions of this cell-derived class, we have fractionated whole blood into plasma and its ensuing sub-fractions. Cellular miRNA signatures in cohorts of normal individuals are catalogued and the abundance and gender-specific expression of bona fide circulating markers explored after calibrating the signal for this interfering class. A map of differentially expressed profiles is presented and the intrinsic variability of circulating miRNA species investigated in subsets of healthy males and females.
PMCID: PMC3117799  PMID: 21698099
6.  GO::TermFinder—open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes 
Bioinformatics (Oxford, England)  2004;20(18):3710-3715.
GO::TermFinder comprises a set of object-oriented Perl modules for accessing Gene Ontology (GO) information and evaluating and visualizing the collective annotation of a list of genes to GO terms. It can be used to draw conclusions from microarray and other biological data, calculating the statistical significance of each annotation. GO::TermFinder can be used on any system on which Perl can be run, either as a command line application, in single or batch mode, or as a web-based CGI script.
The full source code and documentation for GO::TermFinder are freely available from
PMCID: PMC3037731  PMID: 15297299
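Entry 6 mentions calculating the statistical significance of each annotation. One common way to do this for a gene list is a one-sided hypergeometric test, sketched below. This is an assumption for illustration: the abstract does not name the test, the function is not part of the GO::TermFinder API, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, hits: int) -> float:
    """P(X >= hits) when drawing `sample` genes without replacement.

    population: genes in the background set
    annotated:  background genes annotated to the GO term
    sample:     genes in the list of interest
    hits:       listed genes annotated to the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities of seeing `hits` or more annotated genes.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Toy numbers: 10 background genes, 4 annotated, a list of 3 genes with 2 annotated.
toy_p = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
```

With many GO terms tested at once, the raw p-values would need a correction such as Bonferroni or a false discovery rate procedure before being interpreted.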
7.  The liver pharmacological and xenobiotic gene response repertoire 
We have used a supervised classification approach to systematically mine a large microarray database derived from livers of compound-treated rats. Thirty-four distinct signatures (classifiers) for pharmacological and toxicological end points can be identified. Just 200 genes are sufficient to classify these end points. Signatures were enriched in xenobiotic and immune response genes and contain un-annotated genes, indicating that not all key genes in the liver xenobiotic responses have been characterized. Many signatures with equal classification capabilities but with no gene in common can be derived for the same phenotypic end point. The analysis of the union of all genes present in these signatures can reveal the underlying biology of that end point as illustrated here using liver fibrosis signatures. Our approach using the whole genome and a diverse set of compounds allows a comprehensive view of most pharmacological and toxicological questions and is applicable to other situations such as disease and development.
PMCID: PMC2290941  PMID: 18364709
biomarker; data mining; liver; toxicity; toxicology; xenobiotic
8.  The Stanford Microarray Database: implementation of new analysis tools and open source release of software 
Nucleic Acids Research  2006;35(Database issue):D766-D770.
The Stanford Microarray Database (SMD) is a research tool and archive that allows hundreds of researchers worldwide to store, annotate, analyze and share data generated by microarray technology. SMD supports most major microarray platforms, and is MIAME-supportive and can export or import MAGE-ML. The primary mission of SMD is to be a research tool that supports researchers from the point of data generation to data publication and dissemination, but it also provides unrestricted access to analysis tools and public data from 300 publications. In addition to supporting ongoing research, SMD makes its source code fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD. In this article, we describe several data analysis tools implemented in SMD and we discuss features of our software release.
PMCID: PMC1781111  PMID: 17182626
9.  The Stanford Microarray Database accommodates additional microarray platforms and data formats 
Nucleic Acids Research  2004;33(Database Issue):D580-D582.
The Stanford Microarray Database (SMD) is a research tool for hundreds of Stanford researchers and their collaborators. In addition, SMD functions as a resource for the entire biological research community by providing unrestricted access to microarray data published by SMD users and by disseminating its source code. In addition to storing GenePix (Axon Instruments) and ScanAlyze output from spotted microarrays, SMD has recently added the ability to store, retrieve, display and analyze the complete raw data produced by several additional microarray platforms and image analysis software packages, so that we can also now accept data from Affymetrix GeneChips (MAS5/GCOS or dChip), Agilent Catalog or Custom arrays (using Agilent's Feature Extraction software) or data created by SpotReader (Niles Scientific). We have implemented software that allows us to accept MAGE-ML documents from array manufacturers and to submit MIAME-compliant data in MAGE-ML format directly to ArrayExpress and GEO, greatly increasing the ease with which data from SMD can be published adhering to accepted standards and also increasing the accessibility of published microarray data to the general public. We have introduced a new tool to facilitate data sharing among our users, so that datasets can be shared during, before or after the completion of data analysis. The latest version of the source code for the complete database package was released in November 2004, allowing researchers around the world to deploy their own installations of SMD.
PMCID: PMC539960  PMID: 15608265
10.  The Stanford Microarray Database: data access and quality assessment tools 
Nucleic Acids Research  2003;31(1):94-96.
The Stanford Microarray Database (SMD) serves as a microarray research database for Stanford investigators and their collaborators. In addition, SMD functions as a resource for the entire scientific community, by making freely available all of its source code and providing full public access to data published by SMD users, along with many tools to explore and analyze those data. SMD currently provides public access to data from 3500 microarrays, including data from 85 publications, and this total is increasing rapidly. In this article, we describe some of SMD's newer tools for accessing public data, assessing data quality and for data analysis.
PMCID: PMC165525  PMID: 12519956
