1.  Small tRNA-derived RNAs are increased and more abundant than microRNAs in chronic hepatitis B and C 
Scientific Reports  2015;5:7675.
Persistent infections with hepatitis B virus (HBV) or hepatitis C virus (HCV) account for the majority of cases of hepatic cirrhosis and hepatocellular carcinoma (HCC) worldwide. Small, non-coding RNAs play important roles in virus-host interactions. We used high-throughput sequencing to conduct an unbiased profiling of small (14–40 nt) RNAs in liver from Japanese subjects with advanced hepatitis B or C and HCC. Small RNAs derived from tRNAs, specifically 30–35 nucleotide-long 5′ tRNA-halves (5′ tRHs), were abundant in non-malignant liver and significantly increased in humans and chimpanzees with chronic viral hepatitis. 5′ tRH abundance exceeded microRNA abundance in most infected non-cancerous tissues. In contrast, in matched cancer tissue, 5′ tRH abundance was reduced, and relative abundance of individual 5′ tRHs was altered. In hepatitis B-associated HCC, 5′ tRH abundance correlated with expression of the tRNA-cleaving ribonuclease, angiogenin. These results demonstrate that tRHs are the most abundant small RNAs in chronically infected liver and that their abundance is altered in liver cancer.
PMCID: PMC4286764  PMID: 25567797
2.  Prospective Associations of Coronary Heart Disease Loci in African Americans Using the MetaboChip: The PAGE Study 
PLoS ONE  2014;9(12):e113203.
Coronary heart disease (CHD) is a leading cause of morbidity and mortality in African Americans. However, there is a paucity of studies assessing genetic determinants of CHD in African Americans. We examined the association of published variants in CHD loci with incident CHD, attempted to fine map these loci, and characterize novel variants influencing CHD risk in African Americans.
Methods and Results
Up to 8,201 African Americans (including 546 first CHD events) were genotyped using the MetaboChip array in the Atherosclerosis Risk in Communities (ARIC) study and Women's Health Initiative (WHI). We tested associations using Cox proportional hazard models in sex- and study-stratified analyses and combined results using meta-analysis. Among 44 validated CHD loci available in the array, we replicated and fine-mapped the SORT1 locus, and showed the same direction of effect as reported in studies of individuals of European ancestry for SNPs in 22 additional published loci. We also identified a SNP achieving array-wide significance (MYC: rs2070583, allele frequency 0.02, P = 8.1×10⁻⁸), but the association did not replicate in an additional 8,059 African Americans (577 events) from the WHI, HealthABC and GeneSTAR studies, or in a meta-analysis of 5 cohort studies of European ancestry (24,024 individuals including 1,570 cases of MI and 2,406 cases of CHD) from the CHARGE Consortium.
Our findings suggest that some CHD loci previously identified in individuals of European ancestry may be relevant to incident CHD in African Americans.
PMCID: PMC4277270  PMID: 25542012
3.  HDL-transferred microRNA-223 regulates ICAM-1 expression in endothelial cells 
Nature communications  2014;5:3292.
High-density lipoproteins (HDL) have many biological functions, including reducing endothelial activation and adhesion molecule expression. We recently reported that HDL transport and deliver functional microRNAs (miRNA). Here we show that HDL suppresses expression of intercellular adhesion molecule 1 (ICAM-1) through the transfer of miR-223 to endothelial cells. After incubation of endothelial cells with HDL, mature miR-223 levels are significantly increased in endothelial cells and decreased on HDL. However, miR-223 is not transcribed in endothelial cells and is not increased in cells treated with HDL from miR-223−/− mice. HDL inhibit ICAM-1 protein levels, but not in cells pretreated with miR-223 inhibitors. ICAM-1 is a direct target of HDL-transferred miR-223 and this is the first example of an extracellular miRNA regulating gene expression in cells where it is not transcribed. Collectively, we demonstrate that HDL’s anti-inflammatory properties are conferred, in part, through HDL-miR-223 delivery and translational repression of ICAM-1 in endothelial cells.
PMCID: PMC4189962  PMID: 24576947
4.  An integrated analysis of the SOX2 microRNA response program in human pluripotent and nullipotent stem cell lines 
BMC Genomics  2014;15(1):711.
SOX2 is a core component of the transcriptional network responsible for maintaining embryonal carcinoma cells (ECCs) in a pluripotent, undifferentiated state of self-renewal. As such, SOX2 is an oncogenic transcription factor and crucial cancer stem cell (CSC) biomarker in embryonal carcinoma and, as more recently found, in the stem-like cancer cell component of many other malignancies. SOX2 is furthermore a crucial factor in the maintenance of adult stem cell phenotypes and has additional roles in cell fate determination. The SOX2-linked microRNA (miRNA) transcriptome and regulome have not yet been fully defined in human pluripotent cells or CSCs. To improve our understanding of the SOX2-linked miRNA regulatory network as a contribution to the phenotype of these cell types, we used high-throughput differential miRNA and gene expression analysis combined with existing genome-wide SOX2 chromatin immunoprecipitation (ChIP) data to map the SOX2 miRNA transcriptome in two human embryonal carcinoma cell (hECC) lines.
Whole-microRNAome and genome analysis of SOX2-silenced hECCs revealed many miRNAs regulated by SOX2, including several with highly characterised functions in both cancer and embryonic stem cell (ESC) biology. We subsequently performed genome-wide differential expression analysis and applied a Monte Carlo simulation algorithm and target prediction to identify a SOX2-linked miRNA regulome, which was strongly enriched with epithelial-to-mesenchymal transition (EMT) markers. Additionally, several deregulated miRNAs important to EMT processes had SOX2 binding sites in their promoter regions.
In ESC-like CSCs, SOX2 regulates a large miRNA network that regulates and interlinks the expression of crucial genes involved in EMT.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-711) contains supplementary material, which is available to authorized users.
PMCID: PMC4162954  PMID: 25156079
SOX2; microRNA; Embryonic stem cell; Embryonal carcinoma; Pluripotency; EMT
5.  Promoter-proximal CCCTC-factor binding is associated with an increase in the transcriptional pausing index 
Bioinformatics  2012;29(12):1485-1487.
Motivation: It has been known for more than 2 decades that after RNA polymerase II (RNAPII) initiates transcription, it can enter into a paused or stalled state immediately downstream of the transcription start site before productive elongation. Recent advances in high-throughput genomic technologies facilitated the discovery that RNAPII pausing at promoters is a widespread physiologically regulated phenomenon. The molecular underpinnings of pausing are incompletely understood. The CCCTC-factor (CTCF) is a ubiquitous nuclear factor that has diverse regulatory functions, including a recently discovered role in promoting RNAPII pausing at splice sites.
Results: In this study, we analyzed CTCF binding sites and nascent transcriptomic data from three different cell types, and found that promoter-proximal CTCF binding is significantly associated with RNAPII pausing.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3673211  PMID: 23047559
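The pausing index named in the title of entry 5 is commonly computed as the ratio of read density near the transcription start site (TSS) to read density over the rest of the gene. The block below is a minimal sketch of that ratio, not the authors' pipeline: the function name, the plus-strand simplification, and the promoter window of −50 to +300 relative to the TSS are all assumptions made for illustration.

```python
def pausing_index(read_positions, tss, gene_end, promoter_window=(-50, 300)):
    """Ratio of promoter-proximal read density to gene-body read density.

    Assumes a plus-strand gene and one genomic coordinate per read.
    The default window bounds are illustrative, not taken from the paper.
    """
    prom_start = tss + promoter_window[0]
    prom_end = tss + promoter_window[1]
    if gene_end <= prom_end:
        raise ValueError("gene too short for the chosen promoter window")

    # Count reads in the promoter-proximal window and in the gene body.
    prom_reads = sum(prom_start <= p < prom_end for p in read_positions)
    body_reads = sum(prom_end <= p < gene_end for p in read_positions)

    # Normalize counts by window length so the two regions are comparable.
    prom_density = prom_reads / (prom_end - prom_start)
    body_density = body_reads / (gene_end - prom_end)

    if body_density == 0:
        # No gene-body signal: the ratio is undefined or unbounded.
        return float("inf") if prom_density > 0 else float("nan")
    return prom_density / body_density
```

A gene with a high value has most of its nascent-transcript signal concentrated near the TSS, which is the pattern interpreted as RNAPII pausing.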
6.  MicroRNA-27b is a regulatory hub in lipid metabolism and is altered in dyslipidemia 
Hepatology (Baltimore, Md.)  2012;57(2):533-542.
Cellular and plasma lipid levels are tightly controlled by complex gene regulatory mechanisms. Elevated plasma lipid content, or hyperlipidemia, is a significant risk factor for cardiovascular morbidity and mortality. MicroRNAs (miRNAs) are posttranscriptional regulators of gene expression and have emerged as important modulators of lipid homeostasis, but the extent of their role has not been systematically investigated. In this study, we performed high-throughput small RNA sequencing and detected approximately 150 miRNAs in mouse liver. We then employed an unbiased, in silico strategy to identify miRNA regulatory hubs in lipid metabolism, and miR-27b was identified as the strongest such hub in human and mouse liver. In addition, hepatic miR-27b levels were determined to be sensitive to plasma hyperlipidemia, as evidenced by its ~3-fold up-regulation in the liver of mice on a high-fat diet (42% calories from fat). Further, we showed in a human hepatocyte cell line (Huh7) that miR-27b regulates the expression (mRNA and protein) of several key lipid-metabolism genes, including Angptl3 and Gpam. Finally, we demonstrated that hepatic miR-27b and its target genes are inversely altered in a mouse model of dyslipidemia and atherosclerosis.
miR-27b is responsive to lipid levels, and controls multiple genes critical to dyslipidemia.
PMCID: PMC3470747  PMID: 22777896
miRNA; lipid; triglyceride; dyslipidemia; atherosclerosis; GPAM; ANGPTL3
7.  Beta Cell 5′-Shifted isomiRs Are Candidate Regulatory Hubs in Type 2 Diabetes 
PLoS ONE  2013;8(9):e73240.
Next-generation deep sequencing of small RNAs has unveiled the complexity of the microRNA (miRNA) transcriptome, which is in large part due to the diversity of miRNA sequence variants (“isomiRs”). Changes to a miRNA’s seed sequence (nucleotides 2–8), including shifted start positions, can redirect targeting to a dramatically different set of RNAs and alter biological function. We performed deep sequencing of small RNA from mouse insulinoma (MIN6) cells (widely used as a surrogate for the study of pancreatic beta cells) and developed a bioinformatic analysis pipeline to profile isomiR diversity. Additionally, we applied the pipeline to recently published small RNA-seq data from primary human beta cells and whole islets and compared the miRNA profiles with that of MIN6. We found that: (1) the miRNA expression profile in MIN6 cells is highly correlated with those of primary human beta cells and whole islets; (2) miRNA loci can generate multiple highly expressed isomiRs with different 5′-start positions (5′-isomiRs); (3) isomiRs with shifted start positions (5′-shifted isomiRs) are highly expressed, and can be as abundant as their unshifted counterparts (5′-reference miRNAs). Finally, we identified 10 beta cell miRNA families as candidate regulatory hubs in a type 2 diabetes (T2D) gene network. The most significant candidate hub was miR-29, which we demonstrated regulates the mRNA levels of several genes critical to beta cell function and implicated in T2D. Three of the candidate miRNA hubs were novel 5′-shifted isomiRs: miR-375+1, miR-375-1 and miR-183-5p+1. We showed by in silico target prediction and in vitro transfection studies that both miR-375+1 and miR-375-1 are likely to target an overlapping, but distinct suite of beta cell genes compared to canonical miR-375. In summary, this study characterizes the isomiR profile in beta cells for the first time, and also highlights the potential functional relevance of 5′-shifted isomiRs to T2D.
PMCID: PMC3767796  PMID: 24039891
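Entry 7 notes that a shifted 5′ start position changes the seed (nucleotides 2–8) and can therefore redirect targeting. The sketch below illustrates that effect on a made-up sequence. The precursor string, the function names, and the choice to hold the mature length fixed while shifting the start are assumptions for illustration; they are not the sequences or the pipeline used in the study.

```python
def seed(mirna_seq):
    """Return the seed: nucleotides 2-8 (1-based) of a mature miRNA."""
    if len(mirna_seq) < 8:
        raise ValueError("sequence too short to contain a seed")
    return mirna_seq[1:8]


def shifted_isomir(precursor, ref_start, length, shift):
    """Mature sequence whose 5' start is moved by `shift` nucleotides.

    ref_start is the 0-based start of the reference miRNA in the precursor.
    shift=+1 starts one nucleotide downstream, shift=-1 one upstream.
    The mature length is held fixed (a simplifying assumption).
    """
    start = ref_start + shift
    if start < 0 or start + length > len(precursor):
        raise ValueError("shifted isomiR falls outside the precursor")
    return precursor[start:start + length]


# Hypothetical precursor fragment, invented for illustration only.
precursor = "AUGGCUACGUAGCUAACGGUUCAGCAUCGGAUCCA"
ref_start, length = 3, 22

for shift in (-1, 0, 1):
    iso = shifted_isomir(precursor, ref_start, length, shift)
    print(f"shift {shift:+d}: {iso}  seed={seed(iso)}")
```

Each one-nucleotide shift slides the seed window by one position, so the three isomiRs carry three different 7-mers even though their sequences overlap almost entirely.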
8.  Illuminating microRNA Transcription from the Epigenome 
Current Genomics  2013;14(1):68-77.
Cellular gene expression is governed by a complex, multi-faceted network of regulatory interactions. In the last decade, microRNAs (miRNAs) have emerged as critical components of this network. miRNAs are small, non-coding RNA molecules that serve as post-transcriptional regulators of gene expression. Although there has been substantive progress in our understanding of miRNA-mediated gene regulation, the mechanisms that control the expression of the miRNAs themselves are less well understood. Identifying the factors that control miRNA expression will be critical for further characterizing miRNA function in normal physiology and pathobiology. We describe recent progress in the efforts to map genomic regions that control miRNA transcription (such as promoters). In particular, we highlight the utility of large-scale “-omic” data, such as those made available by the ENCODE and the NIH Roadmap Epigenomics consortiums, for the discovery of transcriptional control elements that govern miRNA expression. Finally, we discuss how integrative analysis of complementary genetic datasets, such as the NHGRI Genome Wide Association Studies Catalog, can predict novel roles for transcriptional mis-regulation of miRNAs in complex disease etiology.
PMCID: PMC3580781  PMID: 23997652
Chromatin; complex disease; epigenome; genomics; microRNA; nascent RNA; promoter; transcription.
9.  miR-182 and miR-10a Are Key Regulators of Treg Specialisation and Stability during Schistosome and Leishmania-associated Inflammation 
PLoS Pathogens  2013;9(6):e1003451.
A diverse suite of effector immune responses provides protection against various pathogens. However, the array of effector responses must be immunologically regulated to limit pathogen- and immune-associated damage. CD4+Foxp3+ regulatory T cells (Treg) calibrate immune responses; however, how Treg cells adapt to control different effector responses is unclear. To investigate the molecular mechanism of Treg diversity we used whole genome expression profiling and next generation small RNA sequencing of Treg cells isolated from type-1 or type-2 inflamed tissue following Leishmania major or Schistosoma mansoni infection, respectively. In-silico analyses identified two miRNA “regulatory hubs”, miR-10a and miR-182, as critical miRNAs in Th1- or Th2-associated Treg cells, respectively. Functionally and mechanistically, in-vitro and in-vivo systems identified that an IL-12/IFNγ axis regulated miR-10a and its putative transcription factor, Creb. Importantly, reduced miR-10a in Th1-associated Treg cells was critical for Treg function and controlled a suite of genes preventing IFNγ production. In contrast, IL-4 regulated miR-182 and cMaf in Th2-associated Treg cells, which mitigated IL-2 secretion, in part through repression of IL2-promoting genes. Together, this study indicates that CD4+Foxp3+ cells can be shaped by local environmental factors, which orchestrate distinct miRNA pathways preserving Treg stability and suppressor function.
Author Summary
The diversity of pathogens that the immune system encounters is controlled by a diverse suite of immunological effector responses. Preserving a well-controlled protective immune response is essential. Too vigorous an effector response can be as damaging as too little. Regulatory T cells (Treg) calibrate immune responses; however, how Treg cells adapt to control the diverse suite of effector responses is unclear. In this study we investigated the molecular identity of regulatory T cells that control distinct effector immune responses against two discrete pathogens, an intracellular parasitic protozoan, Leishmania major, and an extracellular helminth parasite, Schistosoma mansoni. The two Treg populations studied were phenotypically and functionally different. We identified molecular pathways that influence this diversity and more specifically, we identified that two miRNAs (miR-182 and miR-10a) act as “regulatory hubs” critically controlling distinct properties within each Treg population. This is the first study identifying the upstream molecular pathways controlling Treg cell specialization and provides a new platform of Treg cell manipulation to fine-tune their function.
PMCID: PMC3695057  PMID: 23825948
10.  A Systematic Mapping Approach of 16q12.2/FTO and BMI in More Than 20,000 African Americans Narrows in on the Underlying Functional Variation: Results from the Population Architecture using Genomics and Epidemiology (PAGE) Study 
PLoS Genetics  2013;9(1):e1003171.
Genetic variants in intron 1 of the fat mass– and obesity-associated (FTO) gene have been consistently associated with body mass index (BMI) in Europeans. However, follow-up studies in African Americans (AA) have shown no support for some of the most consistently BMI–associated FTO index single nucleotide polymorphisms (SNPs). This is most likely explained by different race-specific linkage disequilibrium (LD) patterns and lower correlation overall in AA, which provides the opportunity to fine-map this region and narrow in on the functional variant. To comprehensively explore the 16q12.2/FTO locus and to search for second independent signals in the broader region, we fine-mapped a 646–kb region, encompassing the large FTO gene and the flanking gene RPGRIP1L by investigating a total of 3,756 variants (1,529 genotyped and 2,227 imputed variants) in 20,488 AAs across five studies. We observed associations between BMI and variants in the known FTO intron 1 locus: the SNP with the most significant p-value, rs56137030 (8.3×10⁻⁶) had not been highlighted in previous studies. While rs56137030 was correlated at r²>0.5 with 103 SNPs in Europeans (including the GWAS index SNPs), this number was reduced to 28 SNPs in AA. Among rs56137030 and the 28 correlated SNPs, six were located within candidate intronic regulatory elements, including rs1421085, for which we predicted allele-specific binding affinity for the transcription factor CUX1, which has recently been implicated in the regulation of FTO. We did not find strong evidence for a second independent signal in the broader region. In summary, this large fine-mapping study in AA has substantially reduced the number of common alleles that are likely to be functional candidates of the known FTO locus. Importantly, our study demonstrated that comprehensive fine-mapping in AA provides a powerful approach to narrow in on the functional candidate(s) underlying the initial GWAS findings in European populations.
Author Summary
Genetic variants within the fat mass– and obesity-associated (FTO) gene are associated with increased risk of obesity. To better understand which specific genetic variant(s) in this genetic region is associated with obesity risk, we attempt to genotype or impute all known genetic variants in the region and test for association with body mass index as a measurement of obesity in over 20,000 African Americans. We identified 29 potential candidate variants, of which one variant (rs1421085) is a particularly interesting candidate for future functional follow-up studies. Our example shows the powerful approach of studying a large African American population, substantially reducing the number of possible functional variants compared with European descent populations.
PMCID: PMC3547789  PMID: 23341774
11.  Genome-Wide Survey of Natural Selection on Functional, Structural, and Network Properties of Polymorphic Sites in Saccharomyces paradoxus 
Molecular Biology and Evolution  2011;28(9):2615-2627.
Background. To characterize the genetic basis of phenotypic evolution, numerous studies have identified individual genes that have likely evolved under natural selection. However, phenotypic changes may represent the cumulative effect of similar evolutionary forces acting on functionally related groups of genes. Phylogenetic analyses of divergent yeast species have identified functional groups of genes that have evolved at significantly different rates, suggestive of differential selection on the functional properties. However, due to environmental heterogeneity over long evolutionary timescales, selection operating within a single lineage may be dramatically different, and it is not detectable via interspecific comparisons alone. Moreover, interspecific studies typically quantify selection on protein-coding regions using the Dn/Ds ratio, which cannot be extended easily to study selection on noncoding regions or synonymous sites. The population genetic-based analysis of selection operating within a single lineage ameliorates these limitations.
Findings. We investigated selection on several properties associated with genes, promoters, or polymorphic sites, by analyzing the derived allele frequency spectrum of single nucleotide polymorphisms (SNPs) in 28 strains of Saccharomyces paradoxus. We found evidence for significant differential selection between many functionally relevant categories of SNPs, underscoring the utility of function-centric approaches for discovering signatures of natural selection. When comparable, our findings are largely consistent with previous studies based on interspecific comparisons, with one notable exception: our study finds that mutations from an ancient amino acid to a relatively new amino acid are selectively disfavored, whereas interspecific comparisons have found selection against ancient amino acids. Several of our findings have not been addressed through prior interspecific studies: we find that synonymous mutations from preferred to unpreferred codons are selected against and that synonymous SNPs in the linker regions of proteins are relatively less constrained than those within protein domains.
Conclusions. We present the first global survey of selection acting on various functional properties in S. paradoxus. We found that selection pressures previously detected over long evolutionary timescales have also shaped the evolution of S. paradoxus. Importantly, we also make novel discoveries untenable via conventional interspecific analyses.
PMCID: PMC3203547  PMID: 21478372
evolution; natural selection; yeast; derived allele frequency
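Entry 11 rests on the derived allele frequency (DAF) spectrum: for each polymorphic site, the number of sampled strains that carry the derived (non-ancestral) allele. The block below is a minimal sketch of how such a spectrum can be tabulated. The input layout (one allele character per strain), the function name, and the assumption that the ancestral allele is already known for each site are illustrative choices, not details from the paper.

```python
from collections import Counter


def daf_spectrum(genotypes, ancestral):
    """Count polymorphic sites by number of strains with the derived allele.

    genotypes: dict mapping site id -> string of alleles, one per strain
    ancestral: dict mapping site id -> ancestral allele for that site
    Assumes biallelic sites and no missing calls (simplifications).
    """
    spectrum = Counter()
    for site, alleles in genotypes.items():
        derived = sum(a != ancestral[site] for a in alleles)
        # Sites where all or none of the strains are derived are not
        # polymorphic in the sample and are left out of the spectrum.
        if 0 < derived < len(alleles):
            spectrum[derived] += 1
    return dict(spectrum)
```

Comparing the spectra of two SNP categories (for example, synonymous sites in linker regions versus protein domains) is one way to look for differences in constraint: a category skewed toward low derived counts is consistent with stronger selection against new mutations.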
12.  Fine-Mapping and Initial Characterization of QT Interval Loci in African Americans 
PLoS Genetics  2012;8(8):e1002870.
The QT interval (QT) is heritable and its prolongation is a risk factor for ventricular tachyarrhythmias and sudden death. Most genetic studies of QT have examined European ancestral populations; however, the increased genetic diversity in African Americans provides opportunities to narrow association signals and identify population-specific variants. We therefore evaluated 6,670 SNPs spanning eleven previously identified QT loci in 8,644 African American participants from two Population Architecture using Genomics and Epidemiology (PAGE) studies: the Atherosclerosis Risk in Communities study and Women's Health Initiative Clinical Trial. Of the fifteen known independent QT variants at the eleven previously identified loci, six were significantly associated with QT in African American populations (P≤1.20×10⁻⁴): ATP1B1, PLN1, KCNQ1, NDRG4, and two NOS1AP independent signals. We also identified three population-specific signals significantly associated with QT in African Americans (P≤1.37×10⁻⁵): one at NOS1AP and two at ATP1B1. Linkage disequilibrium (LD) patterns in African Americans assisted in narrowing the region likely to contain the functional variants for several loci. For example, African American LD patterns showed that 0 SNPs were in LD with NOS1AP signal rs12143842, compared with European LD patterns that indicated 87 SNPs, which spanned 114.2 Kb, were in LD with rs12143842. Finally, bioinformatic-based characterization of the nine African American signals pointed to functional candidates located exclusively within non-coding regions, including predicted binding sites for transcription factors such as TBX5, which has been implicated in cardiac structure and conductance. In this detailed evaluation of QT loci, we identified several African American SNPs that better define the association with QT and successfully narrowed intervals surrounding established loci.
These results demonstrate that the same loci influence variation in QT across multiple populations, that novel signals exist in African Americans, and that the SNPs identified as strong candidates for functional evaluation implicate gene regulatory dysfunction in QT prolongation.
Author Summary
The QT interval (QT) provides a measure of a ventricular action potential, and its prolongation is associated with sudden death and ventricular arrhythmias. Genome-wide association studies performed in European populations have identified common genetic variants that influence QT. However, it is unclear whether these variants are relevant in other populations, including African Americans. The increased genetic diversity in African Americans also provides opportunities to narrow association signals and identify candidates for functional evaluation. We therefore used data from 8,644 African Americans to further characterize previously identified QT loci. Of the fifteen known independent QT variants at the eleven previously identified QT loci, six were associated with QT in African Americans. We also identified three variants that were independent from previously reported signals and narrowed intervals flanking association signals using patterns of linkage disequilibrium. Finally, bioinformatic-based characterization pointed to candidates located outside protein coding regions. Our results underscore the utility of genetic studies in African ancestral populations to identify novel variants and narrow intervals surrounding established loci. These results suggest that known QT loci are important in African Americans and that further characterization of these loci in other populations may provide additional insights into the genetic and molecular mechanisms underlying QT.
PMCID: PMC3415454  PMID: 22912591
13.  Expression determinants of mammalian argonaute proteins in mediating gene silencing 
Nucleic Acids Research  2011;40(8):3704-3713.
RNA interference occurs by two main processes: mRNA site-specific cleavage and non-cleavage-based mRNA degradation or translational repression. Site-specific cleavage is carried out by argonaute-2 (Ago2), while all four mammalian argonaute proteins (Ago1–Ago4) can carry out non-cleavage-mediated inhibition, suggesting that Ago1, Ago3 and Ago4 may have similar but potentially redundant functions. It has been observed that in mammalian tissues, expression of Ago3 and Ago4 is dramatically lower compared with Ago1; however, an optimization of the Ago3 and Ago4 coding sequences to include only the most common codon at each amino acid position was able to augment the expression of Ago3 and Ago4 to levels comparable to that of Ago1 and Ago2. Thus, we examined whether particular sequence features exist in the coding region of Ago3 and Ago4 that may prevent a high level of expression. Swapping specific sub-regions of wild-type and optimized Ago sequence identified the portion of the coding region (nucleotides 1–1163 for Ago3 and 1–1494 for Ago4) that is most influential for expression. This finding has implications for the evolutionary conservation of Ago proteins in the mammalian lineage and the biological role that potentially redundant Ago proteins may have.
PMCID: PMC3333847  PMID: 22210886
14.  Discovery of active enhancers through bidirectional expression of short transcripts 
Genome Biology  2011;12(11):R113.
Long-range regulatory elements, such as enhancers, exert substantial control over tissue-specific gene expression patterns. Genome-wide discovery of functional enhancers in different cell types is important for our understanding of genome function as well as human disease etiology.
In this study, we developed an in silico approach to model the previously reported phenomenon of transcriptional pausing, accompanied by divergent transcription, at active promoters. We then used this model for large-scale prediction of non-promoter-associated bidirectional expression of short transcripts. Our predictions were significantly enriched for DNase hypersensitive sites, histone H3 lysine 27 acetylation (H3K27ac), and other chromatin marks associated with active rather than poised or repressed enhancers. We also detected modest bidirectional expression at binding sites of the CCCTC-factor (CTCF) genome-wide, particularly those that overlap H3K27ac.
Our findings indicate that the signature of bidirectional expression of short transcripts, learned from promoter-proximal transcriptional pausing, can be used to predict active long-range regulatory elements genome-wide, likely due in part to specific association of RNA polymerase with enhancer regions.
PMCID: PMC3334599  PMID: 22082242
15.  Global epigenomic analysis of primary human pancreatic islets provides insights into type 2 diabetes susceptibility loci 
Cell metabolism  2010;12(5):443-455.
Identifying cis-regulatory elements is important to understand how human pancreatic islets modulate gene expression in physiologic or pathophysiologic (e.g., diabetic) conditions. We conducted genome-wide analysis of DNase I hypersensitive sites, histone H3 lysine methylation modifications (K4me1, K4me3, K79me2), and CCCTC factor (CTCF) binding in human islets. This identified ~18,000 putative promoters (several hundred unannotated and islet-active). Surprisingly, active promoter modifications were absent at genes encoding islet-specific hormones, suggesting a distinct regulatory mechanism. Of 34,039 distal (non-promoter) regulatory elements, 47% are islet-unique and 22% are CTCF-bound. In the 18 type 2 diabetes (T2D)-associated loci, we identified 118 putative regulatory elements and confirmed enhancer activity for 12/33 tested. Among 6 regulatory elements harboring T2D-associated variants, 2 exhibit significant allele-specific differences in activity. These findings present a global snapshot of the human islet epigenome and should provide functional context for non-coding variants emerging from genetic studies of T2D and other islet disorders.
PMCID: PMC3026436  PMID: 21035756
16.  Bioinformatic and Genetic Association Analysis of MicroRNA Target Sites in One-Carbon Metabolism Genes 
PLoS ONE  2011;6(7):e21851.
One-carbon metabolism (OCM) is linked to DNA synthesis and methylation, amino acid metabolism and cell proliferation. OCM dysfunction has been associated with increased risk for various diseases, including cancer and neural tube defects. MicroRNAs (miRNAs) are ∼22 nt RNA regulators that have been implicated in a wide array of basic cellular processes, such as differentiation and metabolism. Accordingly, mis-regulation of miRNA expression and/or activity can underlie complex disease etiology. We examined the possibility of OCM regulation by miRNAs. Using computational miRNA target prediction methods and Monte-Carlo based statistical analyses, we identified two candidate miRNA “master regulators” (miR-22 and miR-125) and one candidate pair of “master co-regulators” (miR-344-5p/484 and miR-488) that may influence the expression of a significant number of genes involved in OCM. Interestingly, miR-22 and miR-125 are significantly up-regulated in cells grown under low-folate conditions. In a complementary analysis, we identified 15 single nucleotide polymorphisms (SNPs) that are located within predicted miRNA target sites in OCM genes. We genotyped these 15 SNPs in a population of healthy individuals (age 18–28, n = 2,506) that was previously phenotyped for various serum metabolites related to OCM. Prior to correction for multiple testing, we detected significant associations between TCblR rs9426 and methylmalonic acid (p = 0.045), total homocysteine levels (tHcy) (p = 0.033), serum B12 (p < 0.0001), holo transcobalamin (p < 0.0001) and total transcobalamin (p < 0.0001); and between MTHFR rs1537514 and red blood cell folate (p < 0.0001). However, upon further genetic analysis, we determined that in each case, a linked missense SNP is the more likely causative variant. Nonetheless, our Monte-Carlo based in silico simulations suggest that miRNAs could play an important role in the regulation of OCM.
PMCID: PMC3134459  PMID: 21765920
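Entry 16 uses Monte Carlo simulation to ask whether a miRNA's predicted targets are over-represented among OCM genes. The abstract does not describe the null model, so the block below is only a sketch of one common form of the test: compare the observed overlap with the overlaps obtained from random gene sets of the same size. The function name, the uniform sampling from a background gene list, and the add-one correction are assumptions, not the authors' method.

```python
import random


def empirical_enrichment_p(targets, gene_set, background, n_sim=10000, seed=0):
    """Monte Carlo p-value for the overlap between predicted miRNA targets
    and a gene set, against random same-size sets drawn from a background."""
    targets = set(targets)
    gene_set = set(gene_set)
    background = sorted(set(background))  # sorted for reproducible sampling
    if not gene_set <= set(background):
        raise ValueError("gene_set must be a subset of background")

    observed = len(targets & gene_set)
    rng = random.Random(seed)
    k = len(gene_set)

    hits = 0
    for _ in range(n_sim):
        sample = rng.sample(background, k)
        # Count simulations whose overlap is at least as large as observed.
        if sum(g in targets for g in sample) >= observed:
            hits += 1

    # Add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_sim + 1)
```

Under this scheme the smallest reportable p-value is 1 / (n_sim + 1), so the number of simulations sets the resolution of the test.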
17.  New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk 
Dupuis, Josée | Langenberg, Claudia | Prokopenko, Inga | Saxena, Richa | Soranzo, Nicole | Jackson, Anne U | Wheeler, Eleanor | Glazer, Nicole L | Bouatia-Naji, Nabila | Gloyn, Anna L | Lindgren, Cecilia M | Mägi, Reedik | Morris, Andrew P | Randall, Joshua | Johnson, Toby | Elliott, Paul | Rybin, Denis | Thorleifsson, Gudmar | Steinthorsdottir, Valgerdur | Henneman, Peter | Grallert, Harald | Dehghan, Abbas | Hottenga, Jouke Jan | Franklin, Christopher S | Navarro, Pau | Song, Kijoung | Goel, Anuj | Perry, John R B | Egan, Josephine M | Lajunen, Taina | Grarup, Niels | Sparsø, Thomas | Doney, Alex | Voight, Benjamin F | Stringham, Heather M | Li, Man | Kanoni, Stavroula | Shrader, Peter | Cavalcanti-Proença, Christine | Kumari, Meena | Qi, Lu | Timpson, Nicholas J | Gieger, Christian | Zabena, Carina | Rocheleau, Ghislain | Ingelsson, Erik | An, Ping | O’Connell, Jeffrey | Luan, Jian'an | Elliott, Amanda | McCarroll, Steven A | Payne, Felicity | Roccasecca, Rosa Maria | Pattou, François | Sethupathy, Praveen | Ardlie, Kristin | Ariyurek, Yavuz | Balkau, Beverley | Barter, Philip | Beilby, John P | Ben-Shlomo, Yoav | Benediktsson, Rafn | Bennett, Amanda J | Bergmann, Sven | Bochud, Murielle | Boerwinkle, Eric | Bonnefond, Amélie | Bonnycastle, Lori L | Borch-Johnsen, Knut | Böttcher, Yvonne | Brunner, Eric | Bumpstead, Suzannah J | Charpentier, Guillaume | Chen, Yii-Der Ida | Chines, Peter | Clarke, Robert | Coin, Lachlan J M | Cooper, Matthew N | Cornelis, Marilyn | Crawford, Gabe | Crisponi, Laura | Day, Ian N M | de Geus, Eco | Delplanque, Jerome | Dina, Christian | Erdos, Michael R | Fedson, Annette C | Fischer-Rosinsky, Antje | Forouhi, Nita G | Fox, Caroline S | Frants, Rune | Franzosi, Maria Grazia | Galan, Pilar | Goodarzi, Mark O | Graessler, Jürgen | Groves, Christopher J | Grundy, Scott | Gwilliam, Rhian | Gyllensten, Ulf | Hadjadj, Samy | Hallmans, Göran | Hammond, Naomi | Han, Xijing | Hartikainen, Anna-Liisa | Hassanali, Neelam | Hayward, Caroline | 
Heath, Simon C | Hercberg, Serge | Herder, Christian | Hicks, Andrew A | Hillman, David R | Hingorani, Aroon D | Hofman, Albert | Hui, Jennie | Hung, Joe | Isomaa, Bo | Johnson, Paul R V | Jørgensen, Torben | Jula, Antti | Kaakinen, Marika | Kaprio, Jaakko | Kesaniemi, Y Antero | Kivimaki, Mika | Knight, Beatrice | Koskinen, Seppo | Kovacs, Peter | Kyvik, Kirsten Ohm | Lathrop, G Mark | Lawlor, Debbie A | Le Bacquer, Olivier | Lecoeur, Cécile | Li, Yun | Lyssenko, Valeriya | Mahley, Robert | Mangino, Massimo | Manning, Alisa K | Martínez-Larrad, María Teresa | McAteer, Jarred B | McCulloch, Laura J | McPherson, Ruth | Meisinger, Christa | Melzer, David | Meyre, David | Mitchell, Braxton D | Morken, Mario A | Mukherjee, Sutapa | Naitza, Silvia | Narisu, Narisu | Neville, Matthew J | Oostra, Ben A | Orrù, Marco | Pakyz, Ruth | Palmer, Colin N A | Paolisso, Giuseppe | Pattaro, Cristian | Pearson, Daniel | Peden, John F | Pedersen, Nancy L. | Perola, Markus | Pfeiffer, Andreas F H | Pichler, Irene | Polasek, Ozren | Posthuma, Danielle | Potter, Simon C | Pouta, Anneli | Province, Michael A | Psaty, Bruce M | Rathmann, Wolfgang | Rayner, Nigel W | Rice, Kenneth | Ripatti, Samuli | Rivadeneira, Fernando | Roden, Michael | Rolandsson, Olov | Sandbaek, Annelli | Sandhu, Manjinder | Sanna, Serena | Sayer, Avan Aihie | Scheet, Paul | Scott, Laura J | Seedorf, Udo | Sharp, Stephen J | Shields, Beverley | Sigurðsson, Gunnar | Sijbrands, Erik J G | Silveira, Angela | Simpson, Laila | Singleton, Andrew | Smith, Nicholas L | Sovio, Ulla | Swift, Amy | Syddall, Holly | Syvänen, Ann-Christine | Tanaka, Toshiko | Thorand, Barbara | Tichet, Jean | Tönjes, Anke | Tuomi, Tiinamaija | Uitterlinden, André G | van Dijk, Ko Willems | van Hoek, Mandy | Varma, Dhiraj | Visvikis-Siest, Sophie | Vitart, Veronique | Vogelzangs, Nicole | Waeber, Gérard | Wagner, Peter J | Walley, Andrew | Walters, G Bragi | Ward, Kim L | Watkins, Hugh | Weedon, Michael N | Wild, Sarah H | Willemsen, Gonneke | 
Witteman, Jaqueline C M | Yarnell, John W G | Zeggini, Eleftheria | Zelenika, Diana | Zethelius, Björn | Zhai, Guangju | Zhao, Jing Hua | Zillikens, M Carola | Borecki, Ingrid B | Loos, Ruth J F | Meneton, Pierre | Magnusson, Patrik K E | Nathan, David M | Williams, Gordon H | Hattersley, Andrew T | Silander, Kaisa | Salomaa, Veikko | Smith, George Davey | Bornstein, Stefan R | Schwarz, Peter | Spranger, Joachim | Karpe, Fredrik | Shuldiner, Alan R | Cooper, Cyrus | Dedoussis, George V | Serrano-Ríos, Manuel | Morris, Andrew D | Lind, Lars | Palmer, Lyle J | Hu, Frank B. | Franks, Paul W | Ebrahim, Shah | Marmot, Michael | Kao, W H Linda | Pankow, James S | Sampson, Michael J | Kuusisto, Johanna | Laakso, Markku | Hansen, Torben | Pedersen, Oluf | Pramstaller, Peter Paul | Wichmann, H Erich | Illig, Thomas | Rudan, Igor | Wright, Alan F | Stumvoll, Michael | Campbell, Harry | Wilson, James F | Hamsten, Anders | Bergman, Richard N | Buchanan, Thomas A | Collins, Francis S | Mohlke, Karen L | Tuomilehto, Jaakko | Valle, Timo T | Altshuler, David | Rotter, Jerome I | Siscovick, David S | Penninx, Brenda W J H | Boomsma, Dorret | Deloukas, Panos | Spector, Timothy D | Frayling, Timothy M | Ferrucci, Luigi | Kong, Augustine | Thorsteinsdottir, Unnur | Stefansson, Kari | van Duijn, Cornelia M | Aulchenko, Yurii S | Cao, Antonio | Scuteri, Angelo | Schlessinger, David | Uda, Manuela | Ruokonen, Aimo | Jarvelin, Marjo-Riitta | Waterworth, Dawn M | Vollenweider, Peter | Peltonen, Leena | Mooser, Vincent | Abecasis, Goncalo R | Wareham, Nicholas J | Sladek, Robert | Froguel, Philippe | Watanabe, Richard M | Meigs, James B | Groop, Leif | Boehnke, Michael | McCarthy, Mark I | Florez, Jose C | Barroso, Inês
Nature genetics  2010;42(2):105-116.
Circulating glucose levels are tightly regulated. To identify novel glycemic loci, we performed meta-analyses of 21 genome-wide association studies informative for fasting glucose (FG), fasting insulin (FI) and indices of β-cell function (HOMA-B) and insulin resistance (HOMA-IR) in up to 46,186 non-diabetic participants. Follow-up of 25 loci in up to 76,558 additional subjects identified 16 loci associated with FG/HOMA-B and two associated with FI/HOMA-IR. These include nine new FG loci (in or near ADCY5, MADD, ADRA2A, CRY2, FADS1, GLIS3, SLC2A2, PROX1 and FAM148B) and one influencing FI/HOMA-IR (near IGF1). We also demonstrated association of ADCY5, PROX1, GCK, GCKR and DGKB/TMEM195 with type 2 diabetes (T2D). Within these loci, likely biological candidate genes influence signal transduction, cell proliferation, development, glucose-sensing and circadian regulation. Our results demonstrate that genetic studies of glycemic traits can identify T2D risk loci, as well as loci that elevate FG modestly, but do not cause overt diabetes.
PMCID: PMC3018764  PMID: 20081858
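The abstract of entry 17 does not say which meta-analysis model was used. As a rough illustration only, the sketch below combines per-study effect estimates for a single variant with inverse-variance-weighted fixed-effect pooling, a common choice in GWAS meta-analysis. The function name and all numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect pooling for one variant.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]  # weight = 1 / variance
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-allele effects on fasting glucose from three studies.
example_betas = [0.030, 0.025, 0.040]
example_ses = [0.010, 0.012, 0.020]
pooled = fixed_effect_meta(example_betas, example_ses)
```

Studies with smaller standard errors dominate the pooled estimate, which is why adding large cohorts tightens the combined standard error.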
18.  Redirection of Silencing Targets by Adenosine-to-Inosine Editing of miRNAs 
Science (New York, N.Y.)  2007;315(5815):1137-1140.
Primary transcripts of certain microRNA (miRNA) genes are subject to RNA editing that converts adenosine to inosine. However, the importance of miRNA editing remains largely undetermined. Here we report that tissue-specific adenosine-to-inosine editing of miR-376 cluster transcripts leads to predominant expression of edited miR-376 isoform RNAs. One highly edited site is positioned in the middle of the 5′-proximal half “seed” region critical for the hybridization of miRNAs to targets. We provide evidence that the edited miR-376 RNA specifically silences a different set of genes. Repression of phosphoribosyl pyrophosphate synthetase 1, a target of the edited miR-376 RNA and an enzyme involved in the uric-acid synthesis pathway, contributes to tight and tissue-specific regulation of uric-acid levels, revealing a previously unknown role for RNA editing in miRNA-mediated gene silencing.
PMCID: PMC2953418  PMID: 17322061
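To make the seed argument in entry 18 concrete, the sketch below shows how a single A-to-I edit inside the seed changes the perfectly complementary target site. It assumes the seed spans nucleotides 2 to 8 and that inosine base-pairs like guanosine, so the edit is modeled as A to G. The miRNA sequence is invented for illustration and is not miR-376.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mirna, start=2, end=8):
    """Return the seed, using 1-based inclusive positions on the miRNA."""
    return mirna[start - 1:end]


def target_site(seed_seq):
    """Return the perfectly complementary mRNA site, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seed_seq))


def edit_a_to_i(mirna, position):
    """Model A-to-I editing at a 1-based position.

    Inosine is written as G because it pairs with C like guanosine.
    """
    if mirna[position - 1] != "A":
        raise ValueError("position does not hold an adenosine")
    return mirna[:position - 1] + "G" + mirna[position:]


mirna = "UACAGAUCGGAUUACCGUAGC"  # hypothetical sequence
edited = edit_a_to_i(mirna, 4)   # edit the A in the middle of the seed

unedited_site = target_site(seed(mirna))
edited_site = target_site(seed(edited))
```

Here the two sites differ at one position, so an mRNA that matches the unedited seed perfectly would carry a mismatch against the edited one, and vice versa.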
19.  Accurate microRNA target prediction correlates with protein repression levels 
BMC Bioinformatics  2009;10:295.
MicroRNAs are small endogenously expressed non-coding RNA molecules that regulate target gene expression through translation repression or messenger RNA degradation. MicroRNA regulation is performed through pairing of the microRNA to sites in the messenger RNA of protein coding genes. Since experimental identification of miRNA target genes poses difficulties, computational microRNA target prediction is one of the key means in deciphering the role of microRNAs in development and disease.
DIANA-microT 3.0 is an algorithm for microRNA target prediction which is based on several parameters calculated individually for each microRNA and combines conserved and non-conserved microRNA recognition elements into a final prediction score, which correlates with protein production fold change. Specifically, for each predicted interaction the program reports a signal-to-noise ratio and a precision score which can be used as an indication of the false positive rate of the prediction.
Recently, several computational target prediction programs were benchmarked based on a set of microRNA target genes identified by the pSILAC method. In this assessment DIANA-microT 3.0 was found to achieve the highest precision among the most widely used microRNA target prediction programs, reaching approximately 66%. The DIANA-microT 3.0 prediction results are available online in a user-friendly web server at
PMCID: PMC2752464  PMID: 19765283
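The two scores mentioned in entry 19 can be illustrated with simple counts. The sketch below is not the DIANA-microT implementation. It assumes that the signal-to-noise ratio compares predictions for a real miRNA with the average for shuffled "mock" miRNAs, and that the mock count approximates the number of false positives, which gives precision of about 1 - 1/SNR. All function names and counts are hypothetical.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted targets that are confirmed."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no predictions to score")
    return true_positives / total


def signal_to_noise(real_count, mock_counts):
    """Predictions for the real miRNA divided by the mean for mock miRNAs."""
    mean_mock = sum(mock_counts) / len(mock_counts)
    if mean_mock == 0:
        raise ValueError("mock miRNAs produced no predictions")
    return real_count / mean_mock


def precision_from_snr(snr):
    """Precision estimate if mock predictions stand in for false positives."""
    return 1.0 - 1.0 / snr


# Hypothetical benchmark: 66 of 100 predicted targets are confirmed.
benchmark_precision = precision(66, 34)

# Hypothetical counts: 300 sites for the real miRNA, about 100 per mock miRNA.
snr = signal_to_noise(300, [90, 100, 110])
estimated_precision = precision_from_snr(snr)
```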
20.  The database of experimentally supported targets: a functional update of TarBase 
Nucleic Acids Research  2008;37(Database issue):D155-D158.
TarBase5.0 is a database which houses a manually curated collection of experimentally supported microRNA (miRNA) targets in several animal species of central scientific interest, plants and viruses. MiRNAs are small non-coding RNA molecules that exhibit an inhibitory effect on gene expression, interfering with the stability and translational efficiency of the targeted mature messenger RNAs. Even though several computational programs exist to predict miRNA targets, there is a need for a comprehensive collection and description of miRNA targets with experimental support. Here we introduce a substantially extended version of this resource. The current version includes more than 1300 experimentally supported targets. Each target site is described by the miRNA that binds it, the gene in which it occurs, the nature of the experiments that were conducted to test it, the sufficiency of the site to induce translational repression and/or cleavage, and the paper from which all these data were extracted. Additionally, the database is functionally linked to several other relevant and useful databases such as Ensembl, Hugo, UCSC and SwissProt. The TarBase5.0 database can be queried or downloaded from
PMCID: PMC2686456  PMID: 18957447
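The per-site description in entry 20 maps naturally onto a small record type. The sketch below is a hypothetical in-memory model, not TarBase's actual schema or download format. Field names, the helper function and the placeholder values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetRecord:
    """One experimentally supported miRNA target site (hypothetical layout)."""
    mirna: str            # miRNA that binds the site
    gene: str             # gene in which the site occurs
    experiment: str       # nature of the supporting experiment
    site_sufficient: bool  # whether the site alone induces the effect
    effect: str           # "translational repression" and/or "cleavage"
    source_paper: str     # publication the data were extracted from


def targets_of(records, mirna):
    """Return the sorted, de-duplicated genes targeted by one miRNA."""
    return sorted({r.gene for r in records if r.mirna == mirna})


# Placeholder records; none of these values come from the database.
records = [
    TargetRecord("miR-X", "GENE2", "reporter assay", True,
                 "translational repression", "placeholder-ref-1"),
    TargetRecord("miR-X", "GENE1", "western blot", False,
                 "translational repression", "placeholder-ref-2"),
    TargetRecord("miR-Y", "GENE3", "5' RACE", True,
                 "cleavage", "placeholder-ref-3"),
]
```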
21.  Genome-Wide Analysis of Natural Selection on Human Cis-Elements 
PLoS ONE  2008;3(9):e3137.
It has been speculated that the polymorphisms in the non-coding portion of the human genome underlie much of the phenotypic variability among humans and between humans and other primates. If so, these genomic regions may be undergoing rapid evolutionary change, due in part to natural selection. However, the non-coding region is a heterogeneous mix of functional and non-functional regions. Furthermore, the functional regions comprise a variety of different types of elements, each under potentially different selection regimes.
Findings and Conclusions
Using the HapMap and Perlegen polymorphism data that map to a stringent set of putative binding sites in human proximal promoters, we apply the Derived Allele Frequency distribution test of neutrality to provide evidence that many human-specific and primate-specific binding sites are likely evolving under positive selection. We also discuss inherent limitations of publicly available human SNP datasets that complicate the inference of selection pressures. Finally, we show that the genes whose proximal binding sites contain high frequency derived alleles are enriched for positive regulation of protein metabolism and developmental processes. Thus our genome-scale investigation provides evidence for positive selection on putative transcription factor binding sites in human proximal promoters.
PMCID: PMC2522239  PMID: 18781197
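The Derived Allele Frequency (DAF) test in entry 21 rests on the frequency spectrum of derived alleles at polymorphic sites. An excess of high-frequency derived alleles is commonly read as a signal of positive selection. The sketch below only computes the spectrum and a high-frequency fraction. It does not reproduce the statistical comparison used in the paper, and the counts and the 0.8 threshold are hypothetical.

```python
def derived_allele_frequencies(sites):
    """sites: iterable of (derived_count, sample_size) pairs.

    Only polymorphic sites are kept (0 < derived_count < sample_size).
    """
    return [d / n for d, n in sites if 0 < d < n]


def daf_histogram(dafs, n_bins=5):
    """Count frequencies falling in each of n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for f in dafs:
        index = min(int(f * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


def high_frequency_fraction(dafs, threshold=0.8):
    """Fraction of sites whose derived allele frequency exceeds threshold."""
    if not dafs:
        raise ValueError("no polymorphic sites")
    return sum(f > threshold for f in dafs) / len(dafs)


# Hypothetical SNPs in binding sites: (derived allele count, chromosomes sampled).
binding_sites = [(108, 120), (12, 120), (102, 120), (30, 120), (114, 120), (120, 120)]
dafs = derived_allele_frequencies(binding_sites)
spectrum = daf_histogram(dafs)
high_fraction = high_frequency_fraction(dafs)
```

In practice the same quantities would be computed for a neutral reference class of sites, and the two spectra compared.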
22.  A Tutorial of the Poisson Random Field Model in Population Genetics 
Advances in Bioinformatics  2008;2008:257864.
Population genetics is the study of allele frequency changes driven by various evolutionary forces such as mutation, natural selection, and random genetic drift. Although natural selection is widely recognized as a bona fide phenomenon, the extent to which it drives evolution remains unclear and controversial. Various qualitative techniques, or so-called “tests of neutrality”, have been introduced to detect signatures of natural selection. A decade and a half ago, Stanley Sawyer and Daniel Hartl provided a mathematical framework, referred to as the Poisson random field (PRF), with which to determine quantitatively the intensity of selection on a particular gene or genomic region. The recent availability of large-scale genetic polymorphism data has sparked widespread interest in genome-wide investigations of natural selection. To that end, the original PRF model is of particular interest for geneticists and evolutionary genomicists. In this article, we will provide a tutorial of the mathematical derivation of the original Sawyer and Hartl PRF model.
PMCID: PMC2775679  PMID: 19920987
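For orientation, the central result of the PRF model described in entry 22 is usually stated as the expected site frequency spectrum below. The notation is this sketch's own, not the tutorial's: θ is a scaled mutation rate, γ a scaled selection coefficient, n the sample size, and X_i the number of sites at which the derived allele appears i times in the sample. Constant factors differ between papers depending on the scaling convention for θ and γ.

```latex
% The X_i are independent Poisson random variables with means
\begin{align}
\mathbb{E}[X_i]
  &= \theta \int_0^1 \binom{n}{i}\, x^{i} (1-x)^{n-i}\,
     \frac{1 - e^{-2\gamma(1-x)}}{\left(1 - e^{-2\gamma}\right) x (1-x)}\, dx,
  \qquad i = 1, \dots, n-1, \\
% Neutral limit: the selection factor tends to (1 - x), leaving a Beta integral
\lim_{\gamma \to 0} \mathbb{E}[X_i]
  &= \theta \binom{n}{i} \int_0^1 x^{i-1} (1-x)^{n-i}\, dx
   = \frac{\theta}{i}.
\end{align}
```

The neutral limit recovers the familiar θ/i spectrum, so departures from it in observed data are what carry information about γ.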
23.  miRGen: a database for the study of animal microRNA genomic organization and function 
Nucleic Acids Research  2006;35(Database issue):D149-D155.
miRGen is an integrated database of (i) positional relationships between animal miRNAs and genomic annotation sets and (ii) animal miRNA targets according to combinations of widely used target prediction programs. A major goal of the database is the study of the relationship between miRNA genomic organization and miRNA function. This is made possible by three integrated and user-friendly interfaces. The Genomics interface allows the user to explore where whole-genome collections of miRNAs are located with respect to UCSC genome browser annotation sets such as Known Genes, RefSeq Genes, Genscan predicted genes, CpG islands and pseudogenes. These miRNAs are connected through the Targets interface to their experimentally supported target genes from TarBase, as well as computationally predicted target genes from optimized intersections and unions of several widely used mammalian target prediction programs. Finally, the Clusters interface provides predicted miRNA clusters at any given inter-miRNA distance and provides specific functional information on the targets of miRNAs within each cluster. All of these unique features of miRGen are designed to facilitate investigations into miRNA genomic organization, co-transcription and targeting. miRGen can be freely accessed at .
PMCID: PMC1669779  PMID: 17108354
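The idea of miRNA clusters "at any given inter-miRNA distance" in entry 23 can be sketched as single-linkage grouping along each chromosome. The code below is an illustration under stated assumptions, not miRGen's algorithm: distance is measured between start coordinates of neighbouring miRNAs, strand is ignored, and isolated miRNAs are reported as clusters of size one. Names and coordinates are hypothetical.

```python
from collections import defaultdict


def cluster_mirnas(mirnas, max_distance):
    """Group miRNAs whose neighbouring start positions lie within max_distance.

    mirnas: iterable of (name, chromosome, start) tuples.
    Returns a list of clusters, each a list of miRNA names in genomic order.
    """
    by_chromosome = defaultdict(list)
    for name, chromosome, start in mirnas:
        by_chromosome[chromosome].append((start, name))

    clusters = []
    for chromosome in sorted(by_chromosome):
        current = []
        previous_start = None
        for start, name in sorted(by_chromosome[chromosome]):
            # A gap larger than max_distance closes the current cluster.
            if current and start - previous_start > max_distance:
                clusters.append(current)
                current = []
            current.append(name)
            previous_start = start
        if current:
            clusters.append(current)
    return clusters


# Hypothetical miRNA coordinates.
example = [
    ("mir-a", "chr1", 1000),
    ("mir-c", "chr1", 9000),
    ("mir-b", "chr1", 1800),
    ("mir-d", "chr2", 500),
]
tight = cluster_mirnas(example, 1000)
loose = cluster_mirnas(example, 10000)
```

Raising the distance threshold merges neighbouring clusters, which is the knob the abstract describes.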
24.  Non-topographical contrast enhancement in the olfactory bulb 
BMC Neuroscience  2006;7:7.
Contrast enhancement within primary stimulus representations is a common feature of sensory systems that regulates the discrimination of similar stimuli. Whereas most sensory stimulus features can be mapped onto one or two dimensions of quality or location (e.g., frequency or retinotopy), the analogous similarities among odor stimuli are distributed high-dimensionally, necessarily yielding a chemotopically fragmented map upon the surface of the olfactory bulb. While olfactory contrast enhancement has been attributed to decremental lateral inhibitory processes among olfactory bulb projection neurons modeled after those in the retina, the two-dimensional topology of this mechanism is intrinsically incapable of mediating effective contrast enhancement on such fragmented maps. Consequently, current theories are unable to explain the existence of olfactory contrast enhancement.
We describe a novel neural circuit mechanism, non-topographical contrast enhancement (NTCE), which enables contrast enhancement among high-dimensional odor representations exhibiting unpredictable patterns of similarity. The NTCE algorithm relies solely on local intraglomerular computations and broad feedback inhibition, and is consistent with known properties of the olfactory bulb input layer. Unlike mechanisms based upon lateral projections, NTCE does not require a built-in foreknowledge of the similarities in molecular receptive ranges expressed by different olfactory bulb glomeruli, and is independent of the physical location of glomeruli within the olfactory bulb.
Non-topographical contrast enhancement demonstrates how intrinsically high-dimensional sensory data can be represented and processed within a physically two-dimensional neural cortex while retaining the capacity to represent stimulus similarity. In a biophysically constrained computational model of the olfactory bulb, NTCE successfully mediates contrast enhancement among odorant representations in the natural, high-dimensional similarity space defined by the olfactory receptor complement and underlies the concentration-independence of odor quality representations.
PMCID: PMC1368991  PMID: 16433921
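A toy calculation can convey why the mechanism in entry 24 does not depend on where glomeruli sit. In the sketch below each unit's output depends only on its own input and on one shared inhibitory signal, with no neighbour-to-neighbour terms. This is a deliberately simplified stand-in, not the biophysical model from the paper. Using mean activity as the shared signal, the gain parameter and the activation values are all assumptions.

```python
def contrast_enhance(activations, inhibition_gain=1.0):
    """Subtract a shared inhibitory signal from every unit, then rectify.

    The shared signal is the mean activity scaled by inhibition_gain.
    No term depends on a unit's position in the list.
    """
    if not activations:
        return []
    shared = inhibition_gain * sum(activations) / len(activations)
    return [max(0.0, a - shared) for a in activations]


# Hypothetical glomerular activations for one odor.
odor = [1.0, 0.8, 0.2, 0.0]
enhanced = contrast_enhance(odor)

# Shuffling the "map" shuffles the output in the same way.
shuffled = [odor[2], odor[0], odor[3], odor[1]]
enhanced_shuffled = contrast_enhance(shuffled)
```

Weakly activated units are silenced while strongly activated ones survive, and rearranging the units leaves each unit's output unchanged. A lateral-inhibition scheme based on physical neighbours would not have that property.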
25.  Prioritization of Genetic Variants in the microRNA Regulome as Functional Candidates in Genome-Wide Association Studies 
Human Mutation  2013;34(8):1049-1056.
Comprehensive analyses of results from genome-wide association studies (GWAS) have demonstrated that complex disease/trait-associated loci are enriched in gene regulatory regions of the genome. The search for causal regulatory variation has focused primarily on transcriptional elements, such as promoters and enhancers. microRNAs (miRNAs) are now widely appreciated as critical posttranscriptional regulators of gene expression and are thought to impart stability to biological systems. Naturally occurring genetic variation in the miRNA regulome is likely an important contributor to phenotypic variation in the human population. However, the extent to which polymorphic miRNA-mediated gene regulation underlies GWAS signals remains unclear. In this study, we have developed the most comprehensive bioinformatic analysis pipeline to date for cataloging and prioritizing variants in the miRNA regulome as functional candidates in GWAS. We highlight specific findings, including a variant in the promoter of the miRNA let-7 that may contribute to human height variation. We also provide a discussion of how our approach can be expanded in the future. Overall, we believe that the results of this study will be valuable for researchers interested in determining whether GWAS signals implicate the miRNA regulome in their disease/trait of interest.
PMCID: PMC3807557  PMID: 23595788
microRNA; GWAS; gene regulation; polymorphism; complex disease

