1.  The Genomes of Oryza sativa: A History of Duplications 
Yu, Jun | Wang, Jun | Lin, Wei | Li, Songgang | Li, Heng | Zhou, Jun | Ni, Peixiang | Dong, Wei | Hu, Songnian | Zeng, Changqing | Zhang, Jianguo | Zhang, Yong | Li, Ruiqiang | Xu, Zuyuan | Li, Shengting | Li, Xianran | Zheng, Hongkun | Cong, Lijuan | Lin, Liang | Yin, Jianning | Geng, Jianing | Li, Guangyuan | Shi, Jianping | Liu, Juan | Lv, Hong | Li, Jun | Wang, Jing | Deng, Yajun | Ran, Longhua | Shi, Xiaoli | Wang, Xiyin | Wu, Qingfa | Li, Changfeng | Ren, Xiaoyu | Wang, Jingqiang | Wang, Xiaoling | Li, Dawei | Liu, Dongyuan | Zhang, Xiaowei | Ji, Zhendong | Zhao, Wenming | Sun, Yongqiao | Zhang, Zhenpeng | Bao, Jingyue | Han, Yujun | Dong, Lingli | Ji, Jia | Chen, Peng | Wu, Shuming | Liu, Jinsong | Xiao, Ying | Bu, Dongbo | Tan, Jianlong | Yang, Li | Ye, Chen | Zhang, Jingfen | Xu, Jingyi | Zhou, Yan | Yu, Yingpu | Zhang, Bing | Zhuang, Shulin | Wei, Haibin | Liu, Bin | Lei, Meng | Yu, Hong | Li, Yuanzhe | Xu, Hao | Wei, Shulin | He, Ximiao | Fang, Lijun | Zhang, Zengjin | Zhang, Yunze | Huang, Xiangang | Su, Zhixi | Tong, Wei | Li, Jinhong | Tong, Zongzhong | Li, Shuangli | Ye, Jia | Wang, Lishun | Fang, Lin | Lei, Tingting | Chen, Chen | Chen, Huan | Xu, Zhao | Li, Haihong | Huang, Haiyan | Zhang, Feng | Xu, Huayong | Li, Na | Zhao, Caifeng | Li, Shuting | Dong, Lijun | Huang, Yanqing | Li, Long | Xi, Yan | Qi, Qiuhui | Li, Wenjie | Zhang, Bo | Hu, Wei | Zhang, Yanling | Tian, Xiangjun | Jiao, Yongzhi | Liang, Xiaohu | Jin, Jiao | Gao, Lei | Zheng, Weimou | Hao, Bailin | Liu, Siqi | Wang, Wen | Yuan, Longping | Cao, Mengliang | McDermott, Jason | Samudrala, Ram | Wang, Jian | Wong, Gane Ka-Shu | Yang, Huanming
PLoS Biology  2005;3(2):e38.
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
Comparative genome sequencing of indica and japonica rice reveals that duplication of genes and genomic regions has played a major part in the evolution of grass genomes
PMCID: PMC546038  PMID: 15685292
2.  Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution 
Nature Communications  2013;4:1595.
The wild species of the genus Oryza contain a largely untapped reservoir of agronomically important genes for rice improvement. Here we report the 261-Mb de novo assembled genome sequence of Oryza brachyantha. Low activity of long-terminal repeat retrotransposons and massive internal deletions of ancient long-terminal repeat elements lead to the compact genome of Oryza brachyantha. We model 32,038 protein-coding genes in the Oryza brachyantha genome, of which only 70% are located in collinear positions in comparison with the rice genome. Analysing breakpoints of non-collinear genes suggests that double-strand break repair through non-homologous end joining has an important role in gene movement and erosion of collinearity in the Oryza genomes. Transition of euchromatin to heterochromatin in the rice genome is accompanied by segmental and tandem duplications, further expanded by transposable element insertions. The high-quality reference genome sequence of Oryza brachyantha provides an important resource for functional and evolutionary studies in the genus Oryza.
The wild rice species can be used as germplasm resources for this crop’s genetic improvement. Here Chen and colleagues report the de novo sequencing of the O. brachyantha genome, and identify the origin of genome size variation, the role of gene movement and its implications on heterochromatin evolution in the rice genome.
PMCID: PMC3615480  PMID: 23481403
3.  Proteomic identification and functional characterization of MYH9, Hsc70, and DNAJA1 as novel substrates of HDAC6 deacetylase activity 
Protein & Cell  2014;6(1):42-54.
Histone deacetylase 6 (HDAC6), a predominantly cytoplasmic protein deacetylase, participates in a wide range of cellular processes through its deacetylase activity. However, the diverse functions of HDAC6 cannot be fully elucidated with its known substrates. In an attempt to explore the substrate diversity of HDAC6, we performed quantitative proteomic analyses to monitor changes in the abundance of protein lysine acetylation in response to HDAC6 deficiency. We identified 107 proteins with elevated acetylation in the liver of HDAC6 knockout mice. Three cytoplasmic proteins, including myosin heavy chain 9 (MYH9), heat shock cognate protein 70 (Hsc70), and dnaJ homolog subfamily A member 1 (DNAJA1), were verified to interact with HDAC6. The acetylation levels of these proteins were negatively regulated by HDAC6 both in the mouse liver and in cultured cells. Functional studies reveal that HDAC6-mediated deacetylation modulates the actin-binding ability of MYH9 and the interaction between Hsc70 and DNAJA1. These findings consolidate the notion that HDAC6 serves as a critical regulator of protein acetylation with the capability of coordinating various cellular functions.
Electronic supplementary material
The online version of this article (doi:10.1007/s13238-014-0102-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4286133  PMID: 25311840
HDAC6; substrate; lysine acetylation; quantitative proteomics; interaction
5.  CistromeFinder for ChIP-seq and DNase-seq data reuse 
Bioinformatics  2013;29(10):1352-1354.
Summary: Chromatin immunoprecipitation and DNase I hypersensitivity assays with high-throughput sequencing have greatly accelerated the understanding of transcriptional and epigenetic regulation, although data reuse for the community of experimental biologists has been challenging. We created a data portal CistromeFinder that can help query, evaluate and visualize publicly available Chromatin immunoprecipitation and DNase I hypersensitivity assays with high-throughput sequencing data in human and mouse. The database currently contains 6378 samples over 4391 datasets, 313 factors and 102 cell lines or cell populations. Each dataset has gone through a consistent analysis and quality control pipeline; therefore, users could evaluate the overall quality of each dataset before examining binding sites near their genes of interest. CistromeFinder is integrated with UCSC genome browser for visualization, Primer3Plus for ChIP-qPCR primer design and CistromeMap for submitting newly available datasets. It also allows users to leave comments to facilitate data evaluation and update.
PMCID: PMC3654708  PMID: 23508969
6.  Down-Regulation of Telomerase Activity and Activation of Caspase-3 Are Responsible for Tanshinone I-Induced Apoptosis in Monocyte Leukemia Cells in Vitro 
Tanshinone I (Tan-I) is a diterpene quinone extracted from the traditional herbal medicine Salvia miltiorrhiza Bunge. Recently, Tan-I has been reported to have anti-tumor effects. In this study, we investigated the growth-inhibiting and apoptosis-inducing effects of Tan-I on three monocytic leukemia cell lines (U937, THP-1 and SHI 1). Cell viability was measured by MTT assay. Cell apoptosis was assessed by flow cytometry (FCM) and AnnexinV/PI staining. Reverse transcriptase polymerase chain reaction (RT-PCR) and PCR–enzyme-linked immunosorbent assay (ELISA) were used to detect human telomerase reverse transcriptase (hTERT) expression and telomerase activity before and after apoptosis. The activity of caspase-3 was determined by a caspase colorimetric assay kit and Western blot analysis. Expression of the anti-apoptotic gene Survivin was assayed by Western blot and real-time RT-PCR using the ABI PRISM 7500 Sequence Detection System. The results revealed that Tan-I could inhibit the growth of all three leukemia cell lines and cause apoptosis in a time- and dose-dependent manner. After treatment with Tan-I for 48 h, Western blotting showed cleavage of the caspase-3 zymogen protein with the appearance of its 17-kD subunit, and an 89-kD cleavage product of poly (ADP-ribose) polymerase (PARP), a known substrate of caspase-3, was also clearly detected. Both hTERT mRNA expression and telomerase activity decreased concurrently in a dose-dependent manner. Moreover, real-time RT-PCR and Western blot revealed a significant down-regulation of Survivin. We therefore conclude that the induction of apoptosis by Tan-I in monocytic leukemia U937, THP-1 and SHI 1 cells is highly correlated with activation of caspase-3, decreased hTERT mRNA expression and telomerase activity, and down-regulation of Survivin expression. To our knowledge, this is the first report on the effects of Tan-I on monocytic leukemia cells.
PMCID: PMC2904915  PMID: 20640151
Tanshinone I (Tan-I); telomerase; survivin; leukemia
7.  Application of a Novel Population of Multipotent Stem Cells Derived from Skin Fibroblasts as Donor Cells in Bovine SCNT 
PLoS ONE  2015;10(1):e0114423.
Undifferentiated stem cells are better donor cells for somatic cell nuclear transfer (SCNT), resulting in more offspring than more differentiated cells. While various stem cell populations have been confirmed to exist in the skin, progress has been restricted due to the lack of a suitable marker for their prospective isolation. To address this fundamental issue, a marker is required that could unambiguously prove the differentiation state of the donor cells. We therefore utilized magnetic activated cell sorting (MACS) to separate a homogeneous population of small SSEA-4+ cells from a heterogeneous population of bovine embryonic skin fibroblasts (BEF). SSEA-4+ cells were 8-10 μm in diameter and positive for alkaline phosphatase (AP). The percentage of SSEA-4+ cells within the cultured BEF population was low (2-3%). Immunocytochemistry and PCR analyses revealed that SSEA-4+ cells expressed pluripotency-related markers, and could differentiate into cells comprising all three germ layers in vitro. They remained undifferentiated over 20 passages in suspension culture. In addition, cloned embryos derived from SSEA-4+ cells showed significant differences in cleavage rate and blastocyst development when compared with those from BEF and SSEA-4− cells. Moreover, blastocysts derived from SSEA-4+ cells showed a higher total cell number and lower apoptotic index than those derived from BEF and SSEA-4− cells. It is well known that nuclei from pluripotent stem cells yield a higher cloning efficiency than those from adult somatic cells; however, pluripotent stem cells are relatively difficult to obtain from cattle. The SSEA-4+ cells described in the current study provide an attractive candidate for SCNT and a promising platform for the generation of transgenic cattle.
PMCID: PMC4300223  PMID: 25602959
8.  Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment 
GigaScience  2014;3(1):27.
Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic-dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and the emperor penguin [Aptenodytes forsteri].
Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology.
Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.
Electronic supplementary material
The online version of this article (doi:10.1186/2047-217X-3-27) contains supplementary material, which is available to authorized users.
PMCID: PMC4322438  PMID: 25671092
Penguins; Avian genomics; Evolution; Adaptation; Antarctica
9.  Metabolic and Functional Genomic Studies Identify Deoxythymidylate Kinase as a target in LKB1 Mutant Lung Cancer 
Cancer discovery  2013;3(8):870-879.
The LKB1/STK11 tumor suppressor encodes a serine/threonine kinase which coordinates cell growth, polarity, motility, and metabolism. In non-small cell lung cancer, LKB1 is somatically inactivated in 25-30% of cases, often concurrently with activating KRAS mutation. Here, we employed an integrative approach to define novel therapeutic targets in KRAS-driven LKB1 mutant lung cancers. High-throughput RNAi screens in lung cancer cell lines from genetically engineered mouse models driven by activated KRAS with or without coincident Lkb1 deletion led to the identification of Dtymk, encoding deoxythymidylate kinase which catalyzes dTTP biosynthesis, as synthetically lethal with Lkb1 deficiency in mouse and human lung cancer lines. Global metabolite profiling demonstrated that Lkb1-null cells had striking decreases in multiple nucleotide metabolites as compared to the Lkb1-wt cells. Thus, LKB1 mutant lung cancers have deficits in nucleotide metabolism conferring hypersensitivity to DTYMK inhibition, suggesting that DTYMK is a potential therapeutic target in this aggressive subset of tumors.
PMCID: PMC3753578  PMID: 23715154
LKB1; KRAS; DTYMK; CHEK1; NSCLC; GEMM-derived cell line; genome wide RNAi screen; metabolic profiling
10.  Identifying ChIP-seq enrichment using MACS 
Nature protocols  2012;7(9):10.1038/nprot.2012.101.
Model-based Analysis of ChIP-seq (MACS) is a computational algorithm that identifies genome-wide locations of transcription/chromatin factor binding or histone modification from ChIP-seq data. MACS consists of four steps: removing redundant reads, adjusting read position, calculating peak enrichment, and estimating the empirical false discovery rate. In this protocol, we provide a detailed demonstration of how to install MACS and how to use it to analyze three common types of ChIP-seq datasets with different characteristics: the sequence-specific transcription factor FoxA1, the histone modification mark H3K4me3 with sharp enrichment, and the H3K36me3 mark with broad enrichment. We also explain how to interpret and visualize the results of MACS analyses. The algorithm requires approximately 3 GB of RAM and 1.5 hours of computing time to analyze a ChIP-seq dataset containing 30 million reads, an estimate that increases with sequence coverage. MACS is open-source and is available from
PMCID: PMC3868217  PMID: 22936215
MACS; ChIP-seq; peak calling; transcription factor; histone modification
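The first three of the four steps named in the abstract above (removing redundant reads, adjusting read position, calculating enrichment) can be illustrated with a minimal Python sketch. This is not the MACS implementation: the read representation as `(chrom, position, strand)` tuples, the fixed fragment length `frag_len`, the single genome-wide background rate, and all function names are assumptions made here for illustration. The empirical false discovery rate step is not covered.

```python
import math

def dedup(reads):
    """Keep one read per (chrom, 5' position, strand) combination."""
    return sorted(set(reads))

def shift(reads, frag_len):
    """Move each read's 5' end by frag_len // 2 toward the assumed fragment centre."""
    half = frag_len // 2
    return [(c, p + half if s == "+" else p - half) for c, p, s in reads]

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), obtained by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i     # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def window_pvalue(centres, chrom, start, end, genome_size, n_reads):
    """Count shifted reads in [start, end) and test against a uniform background."""
    count = sum(1 for c, p in centres if c == chrom and start <= p < end)
    lam = n_reads * (end - start) / genome_size  # expected reads under uniformity
    return count, poisson_sf(count, lam)
```

A call such as `window_pvalue(shift(dedup(reads), 200), "chr1", 150, 250, 1_000_000, n)` returns the read count in the window together with its Poisson tail probability. A uniform background is the simplest possible choice; it ignores local variation in coverage.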
11.  CR Cistrome: a ChIP-Seq database for chromatin regulators and histone modification linkages in human and mouse 
Nucleic Acids Research  2013;42(D1):D450-D458.
Diversified histone modifications (HMs) are essential epigenetic features. They play important roles in fundamental biological processes including transcription, DNA repair and DNA replication. Chromatin regulators (CRs), which are indispensable in epigenetics, can mediate HMs to adjust chromatin structures and functions. With the development of ChIP-Seq technology, there is an opportunity to study CR and HM profiles at the whole-genome scale. However, no specific resource for the integration of CR ChIP-Seq data or CR-HM ChIP-Seq linkage pairs is currently available. Therefore, we constructed the CR Cistrome database, available online at and, to further elucidate CR functions and CR-HM linkages. Within this database, we collected all publicly available ChIP-Seq data on CRs in human and mouse and categorized the data into four cohorts: the reader, writer, eraser and remodeler cohorts, together with curated introductions and ChIP-Seq data analysis results. For the HM readers, writers and erasers, we provided further ChIP-Seq analysis data for the targeted HMs and schematized the relationships between them. We believe CR Cistrome is a valuable resource for the epigenetics community.
PMCID: PMC3965064  PMID: 24253304
12.  Zinc-finger nickase-mediated insertion of the lysostaphin gene into the beta-casein locus in cloned cows 
Nature Communications  2013;4:2565.
Zinc-finger nickases (ZFNickases) are a type of programmable nuclease that can be engineered from zinc-finger nucleases to induce site-specific single-strand breaks or nicks in genomic DNA, which result in homology-directed repair. Although zinc-finger nuclease-mediated gene disruption has been demonstrated in pigs and cattle, they have not been used to target gene addition into an endogenous gene locus in any large domestic species. Here we show in bovine fetal fibroblasts that targeting ZFNickases to the endogenous β-casein (CSN2) locus stimulates lysostaphin gene addition by homology-directed repair. We find that ZFNickase-treated cells can be successfully used in somatic cell nuclear transfer, resulting in live-born gene-targeted cows. Furthermore, the gene-targeted cows secrete lysostaphin in their milk and in vitro assays demonstrate the milk’s ability to kill Staphylococcus aureus. Our success with this strategy will facilitate new transgenic technologies beneficial to both agriculture and biomedicine.
Zinc-finger nickases are programmable nucleases that can be used to generate site-specific single-strand breaks in DNA. Liu et al. use this technology to insert an antimicrobial gene into the endogenous beta-casein locus in cloned cows, with the aim of providing protection against mastitis.
PMCID: PMC3826644  PMID: 24121612
13.  Establishment and Evaluation of a Stable Cattle Type II Alveolar Epithelial Cell Line 
PLoS ONE  2013;8(9):e76036.
Macrophages and dendritic cells are recognized as key players in the defense against mycobacterial infection. Recent research has confirmed that alveolar epithelial cells (AECs) also play important roles against mycobacterial infections. Thus, establishing a stable cattle AEC line for future endogenous immune research on bacterial invasion is necessary. In the present study, we first purified and immortalized type II AECs (AEC II cells) by transfecting them with a plasmid containing the human telomerase reverse transcriptase gene. We then tested whether or not the immortalized cells retained the basic physiological properties of primary AECs by reverse-transcription polymerase chain reaction and Western blot. Finally, we tested the secretion capacity of immortalized AEC II cells upon stimulation by bacterial invasion. The cattle type II alveolar epithelial cell line (HTERT-AEC II) that we established retained lung epithelial cell characteristics: the cells were positive for surfactants A and B, and they secreted tumor necrosis factor-α and interleukin-6 in response to bacterial invasion. Thus, the cell line we established is a potential tool for research on the relationship between AECs and Mycobacterium tuberculosis.
PMCID: PMC3784436  PMID: 24086682
14.  Sequencing of Fifty Human Exomes Reveals Adaptation to High Altitude 
Science (New York, N.Y.)  2010;329(5987):75-78.
Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18X per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from EPAS1, a transcription factor involved in response to hypoxia. One SNP at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP’s association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.
PMCID: PMC3711608  PMID: 20595611
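The "frequency difference" reported for the EPAS1 SNP above is a difference in allele frequency between two population samples. The following minimal Python sketch shows the arithmetic only; the function names are hypothetical, and the allele counts in the usage note are invented to illustrate the calculation, not taken from the study.

```python
def allele_frequency(alt_count, n_individuals):
    """Frequency of one allele among the 2 * n chromosomes of n diploid individuals."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return alt_count / (2 * n_individuals)

def frequency_difference(alt_a, n_a, alt_b, n_b):
    """Absolute allele-frequency difference between population samples a and b."""
    return abs(allele_frequency(alt_a, n_a) - allele_frequency(alt_b, n_b))
```

With illustrative counts of 87 copies among 50 individuals in one sample and 9 copies among 50 individuals in the other, `frequency_difference(87, 50, 9, 50)` evaluates to approximately 0.78, i.e. a 78% difference.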
15.  CistromeMap: a knowledgebase and web server for ChIP-Seq and DNase-Seq studies in mouse and human 
Bioinformatics  2012;28(10):1411-1412.
Summary: Transcription and chromatin regulators, and histone modifications play essential roles in gene expression regulation. We have created CistromeMap as a web server to provide a comprehensive knowledgebase of all of the publicly available ChIP-Seq and DNase-Seq data in mouse and human. We have also manually curated metadata to ensure annotation consistency, and developed a user-friendly display matrix for quick navigation and retrieval of data for specific factors, cells and papers. Finally, we provide users with summary statistics of ChIP-Seq and DNase-Seq studies.
Availability: Freely available on the web at
PMCID: PMC3348563  PMID: 22495751
16.  A Comprehensive View of Nuclear Receptor Cancer Cistromes 
Cancer research  2011;71(22):6940-6947.
Nuclear receptors (NRs) comprise a superfamily of ligand-activated transcription factors that play important roles in both physiology and diseases including cancer. The technologies of Chromatin ImmunoPrecipitation followed by array hybridization (ChIP-chip) or massively parallel sequencing (ChIP-seq) have been used to map, at an unprecedented rate, the in vivo genome-wide binding (cistrome) of NRs in both normal and cancer cells. We developed a curated database of 88 NR cistrome datasets and other associated high-throughput datasets, including 121 collaborating factor cistromes, 94 epigenomes and 319 transcriptomes. Through integrative analysis of the curated NR ChIP-chip/seq datasets, we discovered novel factor-specific noncanonical motifs that may have important regulatory roles. We also revealed a common feature of NR pioneering factors to recognize relatively short and AT-rich motifs. Most NRs bind predominantly to introns and distal intergenic regions, and binding sites closer to transcription start sites (TSSs) were found to be neither stronger nor more evolutionarily conserved. Interestingly, while most NRs appear to be predominantly transcriptional activators, our analysis suggests that the binding of ESR1, RARA and RARG has both activating and repressive effects. Through meta-analysis of different omic data of the same cancer cell line model from multiple studies, we generated consensus cistrome and expression profiles. We further made probabilistic predictions of the NR target genes by integrating cistrome and transcriptome data, and validated the predictions using expression data from tumor samples. The final database, with comprehensive cistrome, epigenome, transcriptome datasets, and downstream analysis results, constitutes a valuable resource for the nuclear receptor and cancer community.
PMCID: PMC3610570  PMID: 21940749
17.  Exome Capture Sequencing of Adenoma Reveals Genetic Alterations in Multiple Cellular Pathways at the Early Stage of Colorectal Tumorigenesis 
PLoS ONE  2013;8(1):e53310.
Most colorectal adenocarcinomas are believed to arise from adenomas, which are premalignant lesions. Sequencing the whole exome of the adenoma will help identify molecular biomarkers that can predict the occurrence of adenocarcinoma more precisely and help clarify the molecular pathways underlying the initial stage of colorectal tumorigenesis. We performed exome capture sequencing of the normal mucosa, adenoma and adenocarcinoma tissues from the same patient and sequenced the identified mutations in an additional 73 adenomas and 288 adenocarcinomas. Somatic single nucleotide variations (SNVs) were identified in both the adenoma and adenocarcinoma by comparing with the normal control from the same patient. We identified 12 nonsynonymous somatic SNVs in the adenoma and 42 nonsynonymous somatic SNVs in the adenocarcinoma. Most of these mutations, including those in OR6X1, SLC15A3, KRTHB4, RBFOX1, LAMA3, CDH20, BIRC6, NMBR, GLCCI1, EFR3A, and FTHL17, were newly reported in colorectal adenomas. Functional annotation of these mutated genes showed that multiple cellular pathways including Wnt, cell adhesion and ubiquitin mediated proteolysis pathways were altered genetically in the adenoma and that the genetic alterations in the same pathways persist in the adenocarcinoma. CDH20 and LAMA3 were mutated in the adenoma while NRXN3 and COL4A6 were mutated in the adenocarcinoma from the same patient, suggesting for the first time that genetic alterations in the cell adhesion pathway occur as early as in the adenoma. Thus, the comparison of genomic mutations between adenoma and adenocarcinoma provides new insight into the molecular events governing the early step of colorectal tumorigenesis.
PMCID: PMC3534699  PMID: 23301059
18.  Curcumin Attenuates Diabetic Neuropathic Pain by Downregulating TNF-α in a Rat Model 
The mechanisms involved in diabetic neuropathic pain are complex and involve peripheral and central pathophysiological phenomena. Proinflammatory tumour necrosis factor α (TNF-α) and TNF-α receptor 1, which are markers of inflammation, contribute to neuropathic pain. The purpose of this experimental study was to evaluate the effect of curcumin on diabetic pain in rats. We tested 24 rats with diabetes induced by a single intraperitoneal injection of streptozotocin and 24 healthy control rats. Twelve rats in each group received 60 mg/kg oral curcumin daily for 28 days, and the other 12 received vehicle. On days 7, 14, 21, and 28, we tested mechanical allodynia with von Frey hairs and thermal hyperalgesia with radiant heat. Markers of inflammation in the spinal cord dorsal horn on day 28 were estimated with a commercial assay and Western blot analysis. Compared to control rats, diabetic rats exhibited increased mean plasma glucose concentration, decreased mean body weight, and significant pain hypersensitivity, as evidenced by decreased paw withdrawal threshold to von Frey hairs and decreased paw withdrawal latency to heat. Curcumin significantly attenuated the diabetes-induced allodynia and hyperalgesia and reduced the expression of both TNF-α and TNF-α receptor 1. Curcumin seems to relieve diabetic hyperalgesia, possibly through an inhibitory action on TNF-α and TNF-α receptor 1.
PMCID: PMC3590595  PMID: 23471081
diabetic neuropathic pain; hyperalgesia; curcumin; tumour necrosis factor α; tumour necrosis factor α receptor 1.
19.  Systematic evaluation of factors influencing ChIP-seq fidelity 
Nature methods  2012;9(6):609-614.
We performed a systematic evaluation of how variations in sequencing depth and other parameters influence interpretation of Chromatin immunoprecipitation (ChIP) followed by sequencing (ChIP-seq) experiments. Using Drosophila S2 cells, we generated ChIP-seq datasets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected and had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP library complexity at high coverage. The removal of reads originating at the same base reduced false positives while having little effect on detection sensitivity. Even at a depth of ~1 read/bp coverage of mappable genome, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle datasets with deep coverage.
PMCID: PMC3477507  PMID: 22522655
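Two quantities mentioned in the abstract above, library complexity and depth in reads per base pair, can be approximated from single-end data with simple counts. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: reads are assumed to be `(chrom, start, strand)` tuples, complexity is approximated by the fraction of distinct start sites, and the function names are hypothetical.

```python
from collections import Counter

def start_site_summary(reads):
    """Summarise how many reads share an identical start site and strand."""
    counts = Counter(reads)
    total = sum(counts.values())
    distinct = len(counts)
    return {
        "total": total,
        "distinct": distinct,
        # Fraction of reads that remain after collapsing identical start sites.
        "nonredundant_fraction": distinct / total if total else 0.0,
    }

def reads_per_bp(n_reads, mappable_bp):
    """Sequencing depth expressed as reads per mappable base pair."""
    if mappable_bp <= 0:
        raise ValueError("mappable_bp must be positive")
    return n_reads / mappable_bp
```

As the abstract notes, single-end start sites can underestimate complexity at high coverage, because independent fragments may share a start position; the fraction computed here should be read with that limitation in mind.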
20.  Direct Sequencing and Characterization of a Clinical Isolate of Epstein-Barr Virus from Nasopharyngeal Carcinoma Tissue by Using Next-Generation Sequencing Technology
Journal of Virology  2011;85(21):11291-11299.
Epstein-Barr virus (EBV)-encoded molecules have been detected in the tumor tissues of several cancers, including nasopharyngeal carcinoma (NPC), suggesting that EBV plays an important role in tumorigenesis. However, the nature of EBV with respect to genome width in vivo and whether EBV undergoes clonal expansion in the tumor tissues are still poorly understood. In this study, next-generation sequencing (NGS) was used to sequence DNA extracted directly from the tumor tissue of a patient with NPC. Apart from the human sequences, a clinically isolated EBV genome 164.7 kb in size was successfully assembled and named GD2 (GenBank accession number HQ020558). Sequence and phylogenetic analyses showed that GD2 was closely related to GD1, a previously assembled variant derived from a patient with NPC. GD2 contains the most prevalent EBV variants reported in Cantonese patients with NPC, suggesting that it might be the prevalent strain in this population. Furthermore, GD2 could be grouped into a single subtype according to common classification criteria and contains only 6 heterozygous point mutations, suggesting the monoclonal expansion of GD2 in NPC. This study represents the first genome-wide analysis of a clinical isolate of EBV directly extracted from NPC tissue. Our study reveals that NGS allows the characterization of genome-wide variations of EBV in clinical tumors and provides evidence of monoclonal expansion of EBV in vivo. The pipeline could also be applied to the study of other pathogen-related malignancies. With additional NGS studies of NPC, it might be possible to uncover the potential causative EBV variant involved in NPC.
PMCID: PMC3194977  PMID: 21880770
21.  Cistrome: an integrative platform for transcriptional regulation studies 
Genome Biology  2011;12(8):R83.
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery. Cistrome is available at
PMCID: PMC3245621  PMID: 21859476
22.  SHP-2 Promotes the Maturation of Oligodendrocyte Precursor Cells Through Akt and ERK1/2 Signaling In Vitro 
PLoS ONE  2011;6(6):e21058.
Oligodendrocyte precursor cells (OPCs) differentiate into oligodendrocytes (OLs), which are responsible for myelination. Myelin is essential for saltatory nerve conduction in the vertebrate nervous system. However, the molecular mechanisms of maturation and myelination by oligodendrocytes remain elusive.
Methods and Findings
In the present study, we showed that maturation of oligodendrocytes was attenuated by sodium orthovanadate (a broad-spectrum inhibitor of tyrosine phosphatases) and PTPi IV (a specific inhibitor of SHP-2). We also found that SHP-2 was persistently expressed during the maturation process of OPCs. Down-regulation of endogenous SHP-2 impaired oligodendrocyte maturation, and this effect was triiodo-L-thyronine (T3) dependent. Furthermore, over-expression of SHP-2 promoted maturation of oligodendrocytes. Finally, SHP-2 was found to be involved in the activation of Akt and extracellular-regulated kinases 1 and 2 (ERK1/2) induced by T3 in oligodendrocytes.
SHP-2 promotes oligodendrocyte maturation via Akt and ERK1/2 signaling in vitro.
PMCID: PMC3118803  PMID: 21701583
23.  Alterations of tumor-related genes do not exactly match the histopathological grade in gastric adenocarcinomas 
AIM: To investigate the diverse characteristics of different pathological gradings of gastric adenocarcinoma (GA) using tumor-related genes.
METHODS: GA tissues in different pathological gradings and normal tissues were subjected to tissue arrays. Expressions of 15 major tumor-related genes were detected by RNA in situ hybridization along with 3’ terminal digoxin-labeled anti-sense single stranded oligonucleotide and locked nucleic acid modifying probe within the tissue array. The data obtained were processed by support vector machines by four different feature selection methods to discover the respective critical gene/gene subsets contributing to the GA activities of different pathological gradings.
RESULTS: In the comparison of poorly differentiated GA with normal tissues, the tumor-related gene TP53 plays a key role, although six other tumor-related genes could each independently achieve an area under the receiver operating characteristic curve (AUC) of more than 80%. Comparing well differentiated GA with normal tissues, we found that 11 tumor-related genes could each independently achieve an AUC of more than 80%, but only the gene subset TP53, RB and PTEN plays a key role. Only the gene subset Bcl10, UVRAG, APC, Beclin1, NM23, PTEN and RB could distinguish between poorly differentiated and well differentiated GA. No single gene could provide a valid distinction.
CONCLUSION: Contrary to the traditional view, well differentiated cancer tissues have more alterations of important tumor-related genes than poorly differentiated cancer tissues.
PMCID: PMC2835792  PMID: 20205286
Pathological grading; Gastric adenocarcinoma; Tumor-related gene; Support vector machine; RNA in situ hybridization
24.  Seawater-Regulated Genes for Two-Component Systems and Outer Membrane Proteins in Myxococcus 
Journal of Bacteriology  2009;191(7):2102-2111.
When salt-tolerant Myxococcus cells are moved to a seawater environment, they change their growth, morphology, and developmental behavior. Outer membrane proteins and signal transduction pathways may play important roles in this shift. Chip hybridization targeting the genes predicted to encode 226 two-component signal transduction pathways and 74 outer membrane proteins of M. xanthus DK1622 revealed that the expression of 55 corresponding genes in the salt-tolerant strain M. fulvus HW-1 was significantly modified (most were downregulated) by the presence of seawater. Sequencing revealed that these seawater-regulated genes are highly homologous in both strains, suggesting that they have similar roles in the lifestyle of Myxococcus. Seven of the genes that had been reported in M. xanthus DK1622 are involved in different cellular processes, such as fruiting body development, sporulation, or motility. The outer membrane (Om) gene Om031 had the most significant change in expression (downregulated) in response to seawater, while the two-component system (Tc) gene Tc105 had the greatest increase in expression. Their homologues MXAN3106 and MXAN4042 were knocked out in DK1622 to analyze their functions in response to changes in salinity. In addition to having increased salt tolerance, sporulation of the MXAN3106 mutant was enhanced compared to that of DK1622, whereas mutating gene MXAN4042 produced contrary results. The results indicated that the genes that are involved in the cellular processes that are significantly changed in response to salinity may also be involved in the salt tolerance of Myxococcus cells. Regulating the expression levels of these multifunctional genes may allow cells to quickly and efficiently respond to changing conditions in coastal environments.
PMCID: PMC2655515  PMID: 19151139
25.  Model-based Analysis of ChIP-Seq (MACS) 
Genome Biology  2008;9(9):R137.
MACS performs model-based analysis of ChIP-Seq data generated by short read sequencers.
We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.
PMCID: PMC2592715  PMID: 18798982
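The "dynamic Poisson distribution" mentioned in the MACS abstract can be illustrated with a minimal sketch. The abstract does not spell out the computation, so the details here are assumptions for illustration: the local rate is taken as the largest of a genome-wide background rate and rates estimated from control tags in windows of several sizes around a candidate peak, and the peak is scored with a Poisson upper-tail probability. The function names `poisson_sf` and `local_lambda`, the window sizes, and the tag counts are hypothetical and are not taken from the MACS software.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed from the CDF."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(genome_rate, window_counts, peak_width):
    """Expected tag count in a peak of peak_width bp under a local background.

    genome_rate   -- background tags per bp over the whole genome (assumed input)
    window_counts -- dict mapping window size in bp to the control tag count
                     in that window around the candidate peak (assumed input)
    """
    candidates = [genome_rate * peak_width]
    for size, count in window_counts.items():
        # Scale each window's count to the width of the candidate peak.
        candidates.append(count * peak_width / size)
    # Taking the maximum makes the test conservative where the control
    # shows a locally elevated tag density.
    return max(candidates)


# Hypothetical numbers, for illustration only.
lam = local_lambda(0.001, {1000: 10, 5000: 20, 10000: 30}, peak_width=200)
p_value = poisson_sf(15, lam)  # chance of seeing 15 or more tags by background alone
```

Using the largest of several rate estimates means a region with unusually dense control tags needs more ChIP tags before it is called as a peak, which is one way to capture the local biases the abstract refers to.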

