1.  Architecture of the human regulatory network derived from ENCODE data 
Nature  2012;489(7414):91-100.
Transcription factors (TFs) bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 TFs in 458 ChIP-Seq experiments. We found the combinatorial co-association of TFs to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the TF binding into a hierarchy and integrated it with other genomic information (e.g. miRNA regulation), forming a dense meta-network. Factors at different levels have different properties: for instance, top-level TFs more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs -- e.g. noise-buffering feed-forward loops. Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (i.e., differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
PMCID: PMC4154057  PMID: 22955619
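The feed-forward loop mentioned in the abstract above is a three-node motif in which a regulator A targets both B and C, and B also targets C. A minimal sketch of how such motifs could be counted in a directed network follows; the function name, the edge-list representation, and the toy network are illustrative assumptions, not the method or data used in the study.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c.

    `edges` is an iterable of (regulator, target) pairs. This brute-force
    enumeration is only a sketch and is suitable for small networks.
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count


# Hypothetical toy network (names invented for illustration):
# TF1 regulates TF2 and geneX, and TF2 also regulates geneX and geneY.
toy_edges = [
    ("TF1", "TF2"),
    ("TF2", "geneX"),
    ("TF1", "geneX"),
    ("TF2", "geneY"),
]

print(count_feed_forward_loops(toy_edges))
```

Judging whether a motif is enriched would additionally require comparing this count against randomized networks, which the sketch does not attempt.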
2.  The Evolution of Lineage-Specific Regulatory Activities in the Human Embryonic Limb 
Cell  2013;154(1):185-196.
The evolution of human anatomical features likely involved changes in gene regulation during development. However, the nature and extent of human-specific developmental regulatory functions remain unknown. We obtained a genome-wide view of cis-regulatory evolution in human embryonic tissues by comparing the histone modification H3K27ac, which provides a quantitative readout of promoter and enhancer activity, during human, rhesus, and mouse limb development. Based on increased H3K27ac, we find that 13% of promoters and 11% of enhancers have gained activity on the human lineage since the human-rhesus divergence. These gains largely arose by modification of ancestral regulatory activities in the limb or potential co-option from other tissues and are likely to have heterogeneous genetic causes. Most enhancers that exhibit gain of activity in humans originated in mammals. Gains at promoters and enhancers in the human limb are associated with increased gene expression, suggesting they include molecular drivers of human morphological evolution.
PMCID: PMC3785101  PMID: 23827682
3.  Identification of Glyceraldehyde 3-Phosphate Dehydrogenase Sequence and Expression Profiles in Tree Shrew (Tupaia belangeri) 
PLoS ONE  2014;9(6):e98552.
The tree shrew (Tupaia belangeri) diverged from the primate order (Primates) and is classified in Scandentia, a separate taxonomic group of mammals. The tree shrew has been suggested as an animal model for studying human disease, but its genomic sequence is largely uncharacterized. Here we identified the full-length cDNA sequence of a housekeeping gene, glyceraldehyde 3-phosphate dehydrogenase (GAPDH), in the tree shrew. We further constructed a phylogenetic tree based on GAPDH molecules of various organisms and compared the tree shrew GAPDH sequence with those of human and other small experimental animals. This analysis revealed that the tree shrew is closer to human than mouse, rat, rabbit and guinea pig are. Quantitative reverse transcription PCR and western blot analysis further demonstrated that GAPDH is expressed in various tissues of the tree shrew as a generally conserved housekeeping protein, as in human. Our findings provide novel genetic knowledge of the tree shrew and strong evidence that the tree shrew can serve as an experimental model system for studying human disorders.
PMCID: PMC4041755  PMID: 24887411
4.  DNA-templated synthesis of PtAu bimetallic nanoparticle/graphene nanocomposites and their application in glucose biosensor 
In this paper, single-stranded DNA (ss-DNA) is demonstrated to functionalize graphene (GR) and to further guide the growth of PtAu bimetallic nanoparticles (PtAuNPs) on GR with high density and dispersion. The obtained nanocomposites (PtAuNPs/ss-DNA/GR) were characterized by transmission electron microscopy (TEM), energy-dispersive X-ray spectroscopy (EDS), and electrochemical techniques. Then, an enzyme nanoassembly was prepared by self-assembling glucose oxidase (GOD) on the PtAuNPs/ss-DNA/GR nanocomposites (GOD/PtAuNPs/ss-DNA/GR). The nanocomposites provided a suitable microenvironment for GOD to retain its biological activity. Direct and reversible electron transfer between the active site of GOD and the modified electrode was realized without any extra electron mediator. Thus, the prepared GOD/PtAuNPs/ss-DNA/GR electrode was proposed as a biosensor for the quantification of glucose. The effects of pH, applied potential, and temperature on the performance of the biosensor were examined in detail and optimized. Under optimal conditions, the biosensor showed a linear response to glucose concentration in the range of 1.0 to 1,800 μM with a detection limit of 0.3 μM (S/N = 3). The results demonstrate that the developed approach provides a promising strategy to improve the sensitivity and enzyme activity of electrochemical biosensors.
PMCID: PMC3941606  PMID: 24572068
Graphene; PtAu bimetallic nanoparticles; Glucose oxidase; Biosensor; Glucose
5.  Identification and determination of major constituents in a traditional Chinese medicine compound recipe Xiongdankaiming tablet using HPLC-PDA/ESI-MSn and HPLC-UV/ELSD 
Xiongdankaiming tablet (XDKMT), a well-known compound in traditional Chinese medicine, is widely used for the treatment of acute iridocyclitis and primary open-angle glaucoma. In this paper, accurate and reliable methods were developed for the identification of 20 constituents using high-performance liquid chromatography with photodiode array detection and electrospray ionization mass spectrometry (HPLC-PDA/ESI-MSn), and for the determination of nine of the constituents (chlorogenic acid, gentiopicroside, isochlorogenic acid B, diosmetin-7-O-β-D-glucopyranoside, apigenin, diosmetin, tauroursodeoxycholic acid, acacetin, and taurochenodeoxycholic acid) using HPLC with ultraviolet absorption and evaporative light scattering detection (HPLC-UV/ELSD) for the first time. The best results were obtained on a Zorbax SB-C18 column with gradient elution using water (0.1% formic acid) (A) and methanol (0.1% formic acid) (B) at a flow rate of 0.7 ml/min. Tauroursodeoxycholic acid and taurochenodeoxycholic acid, owing to their low UV absorption, were detected by ELSD. The other seven compounds were analyzed by HPLC-UV with variable wavelengths. The calibration curves of all nine constituents showed good linearity (R² > 0.9996) within the linearity ranges. The limits of detection and quantification were in the ranges of 0.0460–9.90 μg/ml and 0.115–24.8 μg/ml, respectively. The accuracy, in terms of recovery, varied from 95.3% to 104.9% with relative standard deviations (RSDs) less than 4.4%. Precision (with intra- and inter-day variations less than 4.4%) was also suitable for the intended use. The developed method was successfully applied to the analysis of major components in XDKMT and provides an appropriate means of quality control for XDKMT.
PMCID: PMC3709065  PMID: 23825146
Traditional Chinese medicine (TCM) compound recipe; Xiongdankaiming tablet; HPLC-UV/ELSD; HPLC-MS; Quality control
6.  RNA-Seq Profiling of Spinal Cord Motor Neurons from a Presymptomatic SOD1 ALS Mouse 
PLoS ONE  2013;8(1):e53575.
Mechanisms involved with degeneration of motor neurons in amyotrophic lateral sclerosis (ALS; Lou Gehrig's Disease) are poorly understood, but genetically inherited forms, comprising ∼10% of the cases, are potentially informative. Recent observations that several inherited forms of ALS involve the RNA binding proteins TDP43 and FUS raise the question as to whether RNA metabolism is generally disturbed in ALS. Here we conduct whole transcriptome profiling of motor neurons from a mouse strain, transgenic for a mutant human SOD1 (G85R SOD1-YFP), that develops symptoms of ALS and paralyzes at 5–6 months of age. Motor neuron cell bodies were laser microdissected from spinal cords at 3 months of age, a time when animals were presymptomatic but showed aggregation of the mutant protein in many lower motor neuron cell bodies and manifested extensive neuromuscular junction morphologic disturbance in their lower extremities. We observed only a small number of transcripts with altered expression levels or splicing in the G85R transgenic compared to age-matched animals of a wild-type SOD1 transgenic strain. Our results indicate that a major disturbance of polyadenylated RNA metabolism does not occur in motor neurons of mutant SOD1 mice, suggesting that the toxicity of the mutant protein lies at the level of translational or post-translational effects.
PMCID: PMC3536741  PMID: 23301088
7.  MCP-1-Induced Histamine Release from Mast Cells Is Associated with Development of Interstitial Cystitis/Bladder Pain Syndrome in Rat Models 
Mediators of Inflammation  2012;2012:358184.
Objective. Interstitial cystitis/bladder pain syndrome (IC/BPS) is characterized by overexpression of monocyte chemoattractant protein-1 (MCP-1) in bladder tissues and induction of mast cell (MC) degranulation. This study was undertaken to explore the mechanism of action of MCP-1 in the development of IC/BPS. Methods. A rat model of IC/BPS was developed by perfusing bladders of nine SPF-grade female Sprague-Dawley rats with protamine sulfate and lipopolysaccharide (PS+LPS). MCP-1 and histamine levels in bladder tissue and urine were detected by immunohistochemistry and ELISA. MC degranulation was measured by immunofluorescence techniques and chemokine (C-C motif) receptor 2 (CCR2) was assayed by flow cytometry. Results. Increased MCP-1 expression in bladder tissue and elevated MCP-1 and histamine levels in the urine were observed in PS+LPS-treated rats. This was accompanied by the expression of CCR2 on MC surfaces, suggesting MCP-1 may induce MC degranulation through CCR2. Exposure to LPS stimulated MCP-1 expression in bladder epithelial cells, and exposure to MCP-1 induced histamine release from MCs. Conclusions. MCP-1 upregulation in IC/BPS is one of the possible contributing factors inducing histamine release from MCs. CCR2 is involved in the process of mast cell degranulation in bladder tissues. These changes may contribute to the development of symptoms of IC/BPS.
PMCID: PMC3459284  PMID: 23049171
8.  Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors 
Genome Biology  2012;13(9):R48.
Transcription factors function by binding different classes of regulatory elements. The Encyclopedia of DNA Elements (ENCODE) project has recently produced binding data for more than 100 transcription factors from about 500 ChIP-seq experiments in multiple cell types. While this large amount of data creates a valuable resource, it is nonetheless overwhelmingly complex and simultaneously incomplete since it covers only a small fraction of all human transcription factors.
As part of the consortium effort in providing a concise abstraction of the data for facilitating various types of downstream analyses, we constructed statistical models that capture the genomic features of three paired types of regions by machine-learning methods: firstly, regions with active or inactive binding; secondly, those with extremely high or low degrees of co-binding, termed HOT and LOT regions; and finally, regulatory modules proximal or distal to genes. From the distal regulatory modules, we developed computational pipelines to identify potential enhancers, many of which were validated experimentally. We further associated the predicted enhancers with potential target transcripts and the transcription factors involved. For HOT regions, we found a significant fraction of transcription factor binding without clear sequence motifs and showed that this observation could be related to strong DNA accessibility of these regions.
Overall, the three pairs of regions exhibit intricate differences in chromosomal locations, chromatin features, factors that bind them, and cell-type specificity. Our machine learning approach enables us to identify features potentially general to all transcription factors, including those not included in the data.
PMCID: PMC3491392  PMID: 22950945
9.  IQSeq: Integrated Isoform Quantification Analysis Based on Next-Generation Sequencing 
PLoS ONE  2012;7(1):e29175.
With the recent advances in high-throughput RNA sequencing (RNA-Seq), biologists are able to measure transcription with unprecedented precision. One problem that can now be tackled is that of isoform quantification: here one tries to reconstruct the abundances of isoforms of a gene. We have developed a statistical solution for this problem, based on analyzing a set of RNA-Seq reads, and a practical implementation, available from, in a tool we call IQSeq (Isoform Quantification in next-generation Sequencing). Here, we present theoretical results which IQSeq is based on, and then use both simulated and real datasets to illustrate various applications of the tool. In order to measure the accuracy of an isoform-quantification result, one would try to estimate the average variance of the estimated isoform abundances for each gene (based on resampling the RNA-seq reads), and IQSeq has a particularly fast algorithm (based on the Fisher Information Matrix) for calculating this, achieving a speedup of times compared to brute-force resampling. IQSeq also calculates an information theoretic measure of overall transcriptome complexity to describe isoform abundance for a whole experiment. IQSeq has many features that are particularly useful in RNA-Seq experimental design, allowing one to optimally model the integration of different sequencing technologies in a cost-effective way. In particular, the IQSeq formalism integrates the analysis of different sample (i.e. read) sets generated from different technologies within the same statistical framework. It also supports a generalized statistical partial-sample-generation function to model the sequencing process. This allows one to have a modular, “plugin-able” read-generation function to support the particularities of the many evolving sequencing technologies.
PMCID: PMC3253133  PMID: 22238592
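The abstract above describes using the Fisher Information Matrix to estimate the variance of isoform abundances without resampling. The following is a minimal sketch of that general idea under a simple Poisson read-count model, which is an assumption made here for illustration; the abstract does not spell out the likelihood IQSeq actually uses. The names `fisher_information`, `invert_2x2`, `design`, and `depth`, and all numbers, are hypothetical.

```python
def fisher_information(design, theta, depth):
    """Fisher information for isoform abundances under a Poisson model.

    Assumed model (not taken from the abstract): the read count in region j
    is Poisson with rate depth * sum_i design[j][i] * theta[i], where
    design[j][i] is 1 if isoform i contains region j and 0 otherwise.
    """
    k = len(theta)
    info = [[0.0] * k for _ in range(k)]
    for row in design:
        rate = depth * sum(a * t for a, t in zip(row, theta))
        if rate <= 0:
            continue  # a region with no expected reads carries no information
        for i in range(k):
            for j in range(k):
                info[i][j] += (depth * row[i]) * (depth * row[j]) / rate
    return info


def invert_2x2(m):
    """Invert a 2x2 matrix; enough for this two-isoform example."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("Fisher information is singular: isoforms not identifiable")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical gene: region 0 is shared by both isoforms,
# region 1 belongs to isoform 0 only.
design = [[1, 1], [1, 0]]
theta = [2.0, 1.0]   # illustrative abundances
depth = 10.0         # illustrative sequencing depth factor

# The inverse Fisher information is the Cramer-Rao lower bound on the
# covariance of unbiased abundance estimates.
covariance_bound = invert_2x2(fisher_information(design, theta, depth))
print(covariance_bound[0][0], covariance_bound[1][1])
```

In this toy case the isoform that has its own private region gets the tighter bound, which matches the intuition that uniquely mapping reads are the most informative.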
10.  Mapping copy number variation by population scale genome sequencing 
Nature  2011;470(7332):59-65.
Genomic structural variants (SVs) are abundant in humans, differing from other variation classes in extent, origin, and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (i.e., copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analyzing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
PMCID: PMC3077050  PMID: 21293372
11.  AlleleSeq: analysis of allele-specific expression and binding in a network framework 
A computational pipeline for constructing a personal diploid genome and determining sites of allele-specific activity is developed. Using a regulatory network framework, allele-specific binding and expression are found to be significantly coordinated across the genome.
Software was developed for building a personal diploid genome sequence and determining sites of allele-specific binding and expression (AlleleSeq).
This computational pipeline was used to analyze variation data, and deeply sequenced RNA-Seq and ChIP-Seq datasets, for individual NA12878 from the 1000 Genomes Project.
The interaction between allele-specific binding and allele-specific expression is investigated, revealing clear coordination.
To study allele-specific expression (ASE) and binding (ASB), that is, differences between the maternally and paternally derived alleles, we have developed a computational pipeline (AlleleSeq). Our pipeline initially constructs a diploid personal genome sequence (and corresponding personalized gene annotation) using genomic sequence variants (SNPs, indels, and structural variants), and then identifies allele-specific events with significant differences in the number of mapped reads between maternal and paternal alleles. There are many technical challenges in the construction and alignment of reads to a personal diploid genome sequence that we address, for example, bias of reads mapping to the reference allele. We have applied AlleleSeq to variation data for NA12878 from the 1000 Genomes Project as well as matched, deeply sequenced RNA-Seq and ChIP-Seq data sets generated for this purpose. In addition to observing fairly widespread allele-specific behavior within individual functional genomic data sets (including results consistent with X-chromosome inactivation), we can study the interaction between ASE and ASB. Furthermore, we investigate the coordination between ASE and ASB from multiple transcription factors events using a regulatory network framework. Correlation analyses and network motifs show mostly coordinated ASB and ASE.
PMCID: PMC3208341  PMID: 21811232
allele-specific; ChIP-Seq; networks; RNA-Seq
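The AlleleSeq abstract above identifies allele-specific events from significant differences in the number of reads mapped to the maternal and paternal alleles. One common way to test such a difference is an exact two-sided binomial test against a 50:50 null, sketched below. The choice of test, the function name, and the read counts are assumptions for illustration; the abstract does not state which statistic the pipeline applies.

```python
from math import comb


def binomial_two_sided_p(maternal_reads, paternal_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed maternal count under the null that each read comes from the
    maternal allele with probability p_null.
    """
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    probs = [
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)
    ]
    observed = probs[maternal_reads]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at two heterozygous sites.
print(binomial_two_sided_p(5, 5))    # balanced site
print(binomial_two_sided_p(10, 0))   # strongly imbalanced site
```

A genome-wide analysis would also need a multiple-testing correction across sites, and the reference-allele mapping bias mentioned in the abstract would have to be handled before counting reads; neither is covered by this sketch.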
12.  Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project 
Gerstein, Mark B. | Lu, Zhi John | Van Nostrand, Eric L. | Cheng, Chao | Arshinoff, Bradley I. | Liu, Tao | Yip, Kevin Y. | Robilotto, Rebecca | Rechtsteiner, Andreas | Ikegami, Kohta | Alves, Pedro | Chateigner, Aurelien | Perry, Marc | Morris, Mitzi | Auerbach, Raymond K. | Feng, Xin | Leng, Jing | Vielle, Anne | Niu, Wei | Rhrissorrakrai, Kahn | Agarwal, Ashish | Alexander, Roger P. | Barber, Galt | Brdlik, Cathleen M. | Brennan, Jennifer | Brouillet, Jeremy Jean | Carr, Adrian | Cheung, Ming-Sin | Clawson, Hiram | Contrino, Sergio | Dannenberg, Luke O. | Dernburg, Abby F. | Desai, Arshad | Dick, Lindsay | Dosé, Andréa C. | Du, Jiang | Egelhofer, Thea | Ercan, Sevinc | Euskirchen, Ghia | Ewing, Brent | Feingold, Elise A. | Gassmann, Reto | Good, Peter J. | Green, Phil | Gullier, Francois | Gutwein, Michelle | Guyer, Mark S. | Habegger, Lukas | Han, Ting | Henikoff, Jorja G. | Henz, Stefan R. | Hinrichs, Angie | Holster, Heather | Hyman, Tony | Iniguez, A. Leo | Janette, Judith | Jensen, Morten | Kato, Masaomi | Kent, W. James | Kephart, Ellen | Khivansara, Vishal | Khurana, Ekta | Kim, John K. | Kolasinska-Zwierz, Paulina | Lai, Eric C. | Latorre, Isabel | Leahey, Amber | Lewis, Suzanna | Lloyd, Paul | Lochovsky, Lucas | Lowdon, Rebecca F. | Lubling, Yaniv | Lyne, Rachel | MacCoss, Michael | Mackowiak, Sebastian D. | Mangone, Marco | McKay, Sheldon | Mecenas, Desirea | Merrihew, Gennifer | Miller, David M. | Muroyama, Andrew | Murray, John I. | Ooi, Siew-Loon | Pham, Hoang | Phippen, Taryn | Preston, Elicia A. | Rajewsky, Nikolaus | Rätsch, Gunnar | Rosenbaum, Heidi | Rozowsky, Joel | Rutherford, Kim | Ruzanov, Peter | Sarov, Mihail | Sasidharan, Rajkumar | Sboner, Andrea | Scheid, Paul | Segal, Eran | Shin, Hyunjin | Shou, Chong | Slack, Frank J. | Slightam, Cindie | Smith, Richard | Spencer, William C. | Stinson, E. O. | Taing, Scott | Takasaki, Teruaki | Vafeados, Dionne | Voronina, Ksenia | Wang, Guilin | Washington, Nicole L. | Whittle, Christina M. 
| Wu, Beijing | Yan, Koon-Kiu | Zeller, Georg | Zha, Zheng | Zhong, Mei | Zhou, Xingliang | Ahringer, Julie | Strome, Susan | Gunsalus, Kristin C. | Micklem, Gos | Liu, X. Shirley | Reinke, Valerie | Kim, Stuart K. | Hillier, LaDeana W. | Henikoff, Steven | Piano, Fabio | Snyder, Michael | Stein, Lincoln | Lieb, Jason D. | Waterston, Robert H.
Science (New York, N.Y.)  2010;330(6012):1775-1787.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
PMCID: PMC3142569  PMID: 21177976
13.  Application of endoscopic hemoclips for nonvariceal bleeding in the upper gastrointestinal tract 
AIM: To investigate acute nonvariceal bleeding in the upper gastrointestinal (GI) tract and evaluate the effects of endoscopic hemoclipping.
METHODS: Sixty-eight cases of acute nonvariceal bleeding in the upper GI tract were given endoscopic treatment with hemoclip application. Clinical data, endoscopic findings, and the effects of the therapy were evaluated.
RESULTS: The 68 cases (male:female = 42:26, age from 9 to 70 years, average 54.4) presented with hematemesis in 26 cases (38.2%), melena in nine cases (13.3%), and both in 33 cases (48.5%). The causes of the bleeding included gastric ulcer (29 cases), duodenal ulcer (11 cases), Dieulafoy’s lesion (11 cases), Mallory-Weiss syndrome (six cases), post-operative (three cases), post-polypectomy bleeding (five cases), and post-sphincterotomy bleeding (three cases); 42 cases had active bleeding. The mean number of hemoclips applied was four. Permanent hemostasis was obtained by hemoclip application in 59 cases; 6 cases required emergent surgery (three cases had peptic ulcers, one had Dieulafoy’s lesion, and two were caused by sphincterotomy); three patients died (two had Dieulafoy’s lesion and one was caused by sphincterotomy); and one had recurrent bleeding with Dieulafoy’s lesion 10 mo later, but in a different location.
CONCLUSION: Endoscopic hemoclip application was an effective and safe method for acute nonvariceal bleeding in the upper GI tract with satisfactory outcomes.
PMCID: PMC2744190  PMID: 19750577
Gastrointestinal hemorrhage; Endoscopy; Hemoclip; Hemostasis
14.  Overexpression of cyclooxygenase-2 in human HepG2, Bel-7402 and SMMC-7721 hepatoma cell lines and mechanism of cyclooxygenase-2 selective inhibitor celecoxib-induced cell growth inhibition and apoptosis 
AIM: To investigate the cyclooxygenase-2 (COX-2) expression level in human HepG2, Bel-7402 and SMMC-7721 hepatoma cell lines and the molecular mechanism of COX-2 selective inhibitor celecoxib-induced cell growth inhibition and cell apoptosis.
METHODS: Hepatoma cells were cultured and treated with celecoxib. Cell in situ hybridization (ISH) and immunocytochemistry were used to detect COX-2 mRNA and protein expression. Proliferating cell nuclear antigen (PCNA) and phosphorylated Akt were also detected by immunocytochemistry. Cell growth rates were assessed by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) colorimetric assay. Celecoxib-induced cell apoptosis was measured by terminal deoxynucleotidyl transferase-mediated dUTP nick end labeling (TUNEL) and flow cytometry (FCM). Phosphorylated Akt and the activated fragments of caspase-9 and caspase-3 were examined by Western blotting analysis.
RESULTS: Increased COX-2 mRNA and protein expression were detected in all three hepatoma cell lines. Celecoxib significantly inhibited cell growth, and the inhibitory effect was dose- and time-dependent, as evidenced by MTT assays and morphological changes. The apoptotic index measured by TUNEL increased with increasing celecoxib concentration and reaction time. With 50 μmol/L celecoxib treatment for 24 h, the apoptotic index of HepG2, Bel-7402 and SMMC-7721 cells was 25.01±3.08%, 26.40±3.05%, and 30.60±2.89%, respectively. Western blotting analysis showed remarkable activation of caspase-9 and caspase-3 and dephosphorylation of Akt (Thr308). Immunocytochemistry also showed reduced PCNA expression and reduced phosphorylated Akt (Thr308) after treatment with celecoxib.
CONCLUSION: COX-2 mRNA and protein overexpression in HepG2, Bel-7402 and SMMC-7721 cell lines correlates with the increased cell growth rate. Celecoxib can inhibit proliferation and induce apoptosis of hepatoma cell lines in a dose- and time-dependent manner.
PMCID: PMC4320331  PMID: 16419156
Apoptosis; Akt; Celecoxib; Caspase; Cell proliferation; COX-2; HCC; PCNA

