
Results 1-25 (96)

1.  The Genomes of Oryza sativa: A History of Duplications 
Yu, Jun | Wang, Jun | Lin, Wei | Li, Songgang | Li, Heng | Zhou, Jun | Ni, Peixiang | Dong, Wei | Hu, Songnian | Zeng, Changqing | Zhang, Jianguo | Zhang, Yong | Li, Ruiqiang | Xu, Zuyuan | Li, Shengting | Li, Xianran | Zheng, Hongkun | Cong, Lijuan | Lin, Liang | Yin, Jianning | Geng, Jianing | Li, Guangyuan | Shi, Jianping | Liu, Juan | Lv, Hong | Li, Jun | Wang, Jing | Deng, Yajun | Ran, Longhua | Shi, Xiaoli | Wang, Xiyin | Wu, Qingfa | Li, Changfeng | Ren, Xiaoyu | Wang, Jingqiang | Wang, Xiaoling | Li, Dawei | Liu, Dongyuan | Zhang, Xiaowei | Ji, Zhendong | Zhao, Wenming | Sun, Yongqiao | Zhang, Zhenpeng | Bao, Jingyue | Han, Yujun | Dong, Lingli | Ji, Jia | Chen, Peng | Wu, Shuming | Liu, Jinsong | Xiao, Ying | Bu, Dongbo | Tan, Jianlong | Yang, Li | Ye, Chen | Zhang, Jingfen | Xu, Jingyi | Zhou, Yan | Yu, Yingpu | Zhang, Bing | Zhuang, Shulin | Wei, Haibin | Liu, Bin | Lei, Meng | Yu, Hong | Li, Yuanzhe | Xu, Hao | Wei, Shulin | He, Ximiao | Fang, Lijun | Zhang, Zengjin | Zhang, Yunze | Huang, Xiangang | Su, Zhixi | Tong, Wei | Li, Jinhong | Tong, Zongzhong | Li, Shuangli | Ye, Jia | Wang, Lishun | Fang, Lin | Lei, Tingting | Chen, Chen | Chen, Huan | Xu, Zhao | Li, Haihong | Huang, Haiyan | Zhang, Feng | Xu, Huayong | Li, Na | Zhao, Caifeng | Li, Shuting | Dong, Lijun | Huang, Yanqing | Li, Long | Xi, Yan | Qi, Qiuhui | Li, Wenjie | Zhang, Bo | Hu, Wei | Zhang, Yanling | Tian, Xiangjun | Jiao, Yongzhi | Liang, Xiaohu | Jin, Jiao | Gao, Lei | Zheng, Weimou | Hao, Bailin | Liu, Siqi | Wang, Wen | Yuan, Longping | Cao, Mengliang | McDermott, Jason | Samudrala, Ram | Wang, Jian | Wong, Gane Ka-Shu | Yang, Huanming | Bennetzen, Jeff
PLoS Biology  2005;3(2):e38.
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
Comparative genome sequencing of indica and japonica rice reveals that duplication of genes and genomic regions has played a major part in the evolution of grass genomes
PMCID: PMC546038  PMID: 15685292
2.  Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution 
Nature Communications  2013;4:1595.
The wild species of the genus Oryza contain a largely untapped reservoir of agronomically important genes for rice improvement. Here we report the 261-Mb de novo assembled genome sequence of Oryza brachyantha. Low activity of long-terminal repeat retrotransposons and massive internal deletions of ancient long-terminal repeat elements lead to the compact genome of Oryza brachyantha. We model 32,038 protein-coding genes in the Oryza brachyantha genome, of which only 70% are located in collinear positions in comparison with the rice genome. Analysing breakpoints of non-collinear genes suggests that double-strand break repair through non-homologous end joining has an important role in gene movement and erosion of collinearity in the Oryza genomes. Transition of euchromatin to heterochromatin in the rice genome is accompanied by segmental and tandem duplications, further expanded by transposable element insertions. The high-quality reference genome sequence of Oryza brachyantha provides an important resource for functional and evolutionary studies in the genus Oryza.
The wild rice species can be used as germplasm resources for this crop’s genetic improvement. Here Chen and colleagues report the de novo sequencing of the O. brachyantha genome, and identify the origin of genome size variation, the role of gene movement and its implications on heterochromatin evolution in the rice genome.
PMCID: PMC3615480  PMID: 23481403
3.  Down-Regulation of Telomerase Activity and Activation of Caspase-3 Are Responsible for Tanshinone I-Induced Apoptosis in Monocyte Leukemia Cells in Vitro 
Tanshinone I (Tan-I) is a diterpene quinone extracted from the traditional herbal medicine Salvia miltiorrhiza Bunge. Recently, Tan-I has been reported to have anti-tumor effects. In this study, we investigated the growth inhibition and apoptosis-inducing effects of Tan-I on three kinds of monocytic leukemia cells (U937, THP-1 and SHI 1). Cell viability was measured by MTT assay. Cell apoptosis was assessed by flow cytometry (FCM) and AnnexinV/PI staining. Reverse transcriptase polymerase chain reaction (RT-PCR) and PCR–enzyme-linked immunosorbent assay (ELISA) were used to detect human telomerase reverse transcriptase (hTERT) expression and telomerase activity before and after apoptosis. The activity of caspase-3 was determined by a caspase colorimetric assay kit and Western blot analysis. Expression of the anti-apoptotic gene Survivin was assayed by Western blot and real-time RT-PCR using the ABI PRISM 7500 Sequence Detection System. The results revealed that Tan-I could inhibit the growth of these three kinds of leukemia cells and cause apoptosis in a time- and dose-dependent manner. After treatment with Tan-I for 48 h, Western blotting showed cleavage of the caspase-3 zymogen protein with the appearance of its 17-kD subunit, and an 89-kD cleavage product of poly (ADP-ribose) polymerase (PARP), a known substrate of caspase-3, was also clearly detected. The expression of hTERT mRNA and the activity of telomerase decreased concurrently in a dose-dependent manner. Moreover, real-time RT-PCR and Western blot revealed a significant down-regulation of Survivin. We therefore conclude that the induction of apoptosis by Tan-I in monocytic leukemia U937, THP-1 and SHI 1 cells is highly correlated with activation of caspase-3, decreased hTERT mRNA expression and telomerase activity, and down-regulation of Survivin expression. To our knowledge, this is the first report about the effects of Tan-I on monocytic leukemia cells.
PMCID: PMC2904915  PMID: 20640151
Tanshinone I (Tan-I); telomerase; survivin; leukemia
4.  Identifying ChIP-seq enrichment using MACS 
Nature protocols  2012;7(9):10.1038/nprot.2012.101.
Model-based Analysis of ChIP-seq (MACS) is a computational algorithm that identifies genome-wide locations of transcription/chromatin factor binding or histone modification from ChIP-seq data. MACS consists of four steps: removing redundant reads, adjusting read position, calculating peak enrichment, and estimating the empirical false discovery rate. In this protocol, we provide a detailed demonstration of how to install MACS and how to use it to analyze three common types of ChIP-seq datasets with different characteristics: the sequence-specific transcription factor FoxA1, the histone modification mark H3K4me3 with sharp enrichment, and the H3K36me3 mark with broad enrichment. We also explain how to interpret and visualize the results of MACS analyses. The algorithm requires approximately 3 GB of RAM and 1.5 hours of computing time to analyze a ChIP-seq dataset containing 30 million reads, an estimate that increases with sequence coverage. MACS is open-source and is available from
PMCID: PMC3868217  PMID: 22936215
MACS; ChIP-seq; peak calling; transcription factor; histone modification
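The "adjusting read position" step named in the abstract above can be illustrated with a short Python sketch. It assumes, since the abstract does not give the rule, that each tag is moved half of an estimated fragment length `d` toward its 3' end. The function name, the tuple layout and the value of `d` are hypothetical and are not taken from the MACS code.

```python
def shift_tags(tags, d):
    """Shift each tag by d // 2 toward its 3' end.

    tags: list of (chrom, position, strand) tuples (hypothetical layout).
    d:    estimated fragment length in bp (an assumed input).
    """
    half = d // 2
    shifted = []
    for chrom, pos, strand in tags:
        # Plus-strand tags move right, minus-strand tags move left,
        # so both converge on the presumed binding site.
        new_pos = pos + half if strand == "+" else pos - half
        shifted.append((chrom, new_pos, strand))
    return shifted


example = [("chr1", 100, "+"), ("chr1", 300, "-")]
print(shift_tags(example, 100))
```

With `d = 100`, the two example tags land at positions 150 and 250, closer to a common centre than the raw 5' ends.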
5.  CR Cistrome: a ChIP-Seq database for chromatin regulators and histone modification linkages in human and mouse 
Nucleic Acids Research  2013;42(D1):D450-D458.
Diversified histone modifications (HMs) are essential epigenetic features. They play important roles in fundamental biological processes including transcription, DNA repair and DNA replication. Chromatin regulators (CRs), which are indispensable in epigenetics, can mediate HMs to adjust chromatin structures and functions. With the development of ChIP-Seq technology, there is an opportunity to study CR and HM profiles at the whole-genome scale. However, no specific resource for the integration of CR ChIP-Seq data or CR-HM ChIP-Seq linkage pairs is currently available. Therefore, we constructed the CR Cistrome database, available online at and, to further elucidate CR functions and CR-HM linkages. Within this database, we collected all publicly available ChIP-Seq data on CRs in human and mouse and categorized the data into four cohorts: the reader, writer, eraser and remodeler cohorts, together with curated introductions and ChIP-Seq data analysis results. For the HM readers, writers and erasers, we provided further ChIP-Seq analysis data for the targeted HMs and schematized the relationships between them. We believe CR Cistrome is a valuable resource for the epigenetics community.
PMCID: PMC3965064  PMID: 24253304
6.  Zinc-finger nickase-mediated insertion of the lysostaphin gene into the beta-casein locus in cloned cows 
Nature Communications  2013;4:2565.
Zinc-finger nickases (ZFNickases) are a type of programmable nuclease that can be engineered from zinc-finger nucleases to induce site-specific single-strand breaks or nicks in genomic DNA, which result in homology-directed repair. Although zinc-finger nuclease-mediated gene disruption has been demonstrated in pigs and cattle, they have not been used to target gene addition into an endogenous gene locus in any large domestic species. Here we show in bovine fetal fibroblasts that targeting ZFNickases to the endogenous β-casein (CSN2) locus stimulates lysostaphin gene addition by homology-directed repair. We find that ZFNickase-treated cells can be successfully used in somatic cell nuclear transfer, resulting in live-born gene-targeted cows. Furthermore, the gene-targeted cows secrete lysostaphin in their milk and in vitro assays demonstrate the milk’s ability to kill Staphylococcus aureus. Our success with this strategy will facilitate new transgenic technologies beneficial to both agriculture and biomedicine.
Zinc-finger nickases are programmable nucleases that can be used to generate site-specific single-strand breaks in DNA. Liu et al. use this technology to insert an antimicrobial gene into the endogenous beta-casein locus in cloned cows, with the aim of providing protection against mastitis.
PMCID: PMC3826644  PMID: 24121612
7.  Establishment and Evaluation of a Stable Cattle Type II Alveolar Epithelial Cell Line 
PLoS ONE  2013;8(9):e76036.
Macrophages and dendritic cells are recognized as key players in the defense against mycobacterial infection. Recent research has confirmed that alveolar epithelial cells (AECs) also play important roles against mycobacterial infections. Thus, establishing a stable cattle AEC line for future endogenous immune research on bacterial invasion is necessary. In the present study, we first purified and immortalized type II AECs (AEC II cells) by transfecting them with a plasmid containing the human telomerase reverse transcriptase gene. We then tested whether or not the immortalized cells retained the basic physiological properties of primary AECs by reverse-transcription polymerase chain reaction and Western blot. Finally, we tested the secretion capacity of immortalized AEC II cells upon stimulation by bacterial invasion. The cattle type II alveolar epithelial cell line (HTERT-AEC II) that we established retained lung epithelial cell characteristics: the cells were positive for surfactants A and B, and they secreted tumor necrosis factor-α and interleukin-6 in response to bacterial invasion. Thus, the cell line we established is a potential tool for research on the relationship between AECs and Mycobacterium tuberculosis.
PMCID: PMC3784436  PMID: 24086682
8.  Sequencing of Fifty Human Exomes Reveals Adaptation to High Altitude 
Science (New York, N.Y.)  2010;329(5987):75-78.
Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18X per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from EPAS1, a transcription factor involved in response to hypoxia. One SNP at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP’s association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.
PMCID: PMC3711608  PMID: 20595611
9.  CistromeMap: a knowledgebase and web server for ChIP-Seq and DNase-Seq studies in mouse and human 
Bioinformatics  2012;28(10):1411-1412.
Summary: Transcription and chromatin regulators, and histone modifications play essential roles in gene expression regulation. We have created CistromeMap as a web server to provide a comprehensive knowledgebase of all of the publicly available ChIP-Seq and DNase-Seq data in mouse and human. We have also manually curated metadata to ensure annotation consistency, and developed a user-friendly display matrix for quick navigation and retrieval of data for specific factors, cells and papers. Finally, we provide users with summary statistics of ChIP-Seq and DNase-Seq studies.
Availability: Freely available on the web at
PMCID: PMC3348563  PMID: 22495751
10.  A Comprehensive View of Nuclear Receptor Cancer Cistromes 
Cancer research  2011;71(22):6940-6947.
Nuclear receptors (NRs) comprise a superfamily of ligand-activated transcription factors that play important roles in both physiology and diseases including cancer. The technologies of Chromatin ImmunoPrecipitation followed by array hybridization (ChIP-chip) or massively parallel sequencing (ChIP-seq) have been used to map, at an unprecedented rate, the in vivo genome-wide binding (cistrome) of NRs in both normal and cancer cells. We developed a curated database of 88 NR cistrome datasets and other associated high-throughput datasets, including 121 collaborating factor cistromes, 94 epigenomes and 319 transcriptomes. Through integrative analysis of the curated NR ChIP-chip/seq datasets, we discovered novel factor-specific noncanonical motifs that may have important regulatory roles. We also revealed a common feature of NR pioneering factors to recognize relatively short and AT-rich motifs. Most NRs bind predominantly to introns and distal intergenic regions, and binding sites closer to transcription start sites (TSSs) were found to be neither stronger nor more evolutionarily conserved. Interestingly, while most NRs appear to be predominantly transcriptional activators, our analysis suggests that the binding of ESR1, RARA and RARG has both activating and repressive effects. Through meta-analysis of different omic data of the same cancer cell line model from multiple studies, we generated consensus cistrome and expression profiles. We further made probabilistic predictions of the NR target genes by integrating cistrome and transcriptome data, and validated the predictions using expression data from tumor samples. The final database, with comprehensive cistrome, epigenome, transcriptome datasets, and downstream analysis results, constitutes a valuable resource for the nuclear receptor and cancer community.
PMCID: PMC3610570  PMID: 21940749
11.  Exome Capture Sequencing of Adenoma Reveals Genetic Alterations in Multiple Cellular Pathways at the Early Stage of Colorectal Tumorigenesis 
PLoS ONE  2013;8(1):e53310.
Most colorectal adenocarcinomas are believed to arise from adenomas, which are premalignant lesions. Sequencing the whole exome of the adenoma will help identify molecular biomarkers that can predict the occurrence of adenocarcinoma more precisely and help in understanding the molecular pathways underlying the initial stage of colorectal tumorigenesis. We performed exome capture sequencing of the normal mucosa, adenoma and adenocarcinoma tissues from the same patient and sequenced the identified mutations in an additional 73 adenomas and 288 adenocarcinomas. Somatic single nucleotide variations (SNVs) were identified in both the adenoma and adenocarcinoma by comparing with the normal control from the same patient. We identified 12 nonsynonymous somatic SNVs in the adenoma and 42 nonsynonymous somatic SNVs in the adenocarcinoma. Most of these mutations, including OR6X1, SLC15A3, KRTHB4, RBFOX1, LAMA3, CDH20, BIRC6, NMBR, GLCCI1, EFR3A, and FTHL17, were newly reported in colorectal adenomas. Functional annotation of these mutated genes showed that multiple cellular pathways, including Wnt, cell adhesion and ubiquitin-mediated proteolysis pathways, were altered genetically in the adenoma and that the genetic alterations in the same pathways persist in the adenocarcinoma. CDH20 and LAMA3 were mutated in the adenoma while NRXN3 and COL4A6 were mutated in the adenocarcinoma from the same patient, suggesting for the first time that genetic alterations in the cell adhesion pathway occur as early as in the adenoma. Thus, the comparison of genomic mutations between adenoma and adenocarcinoma provides new insight into the molecular events governing the early step of colorectal tumorigenesis.
PMCID: PMC3534699  PMID: 23301059
12.  Curcumin Attenuates Diabetic Neuropathic Pain by Downregulating TNF-α in a Rat Model 
The mechanisms involved in diabetic neuropathic pain are complex and involve peripheral and central pathophysiological phenomena. Proinflammatory tumour necrosis factor α (TNF-α) and TNF-α receptor 1, which are markers of inflammation, contribute to neuropathic pain. The purpose of this experimental study was to evaluate the effect of curcumin on diabetic pain in rats. We tested 24 rats with diabetes induced by a single intraperitoneal injection of streptozotocin and 24 healthy control rats. Twelve rats in each group received 60 mg/kg oral curcumin daily for 28 days, and the other 12 received vehicle. On days 7, 14, 21, and 28, we tested mechanical allodynia with von Frey hairs and thermal hyperalgesia with radiant heat. Markers of inflammation in the spinal cord dorsal horn on day 28 were estimated with a commercial assay and Western blot analysis. Compared to control rats, diabetic rats exhibited increased mean plasma glucose concentration, decreased mean body weight, and significant pain hypersensitivity, as evidenced by decreased paw withdrawal threshold to von Frey hairs and decreased paw withdrawal latency to heat. Curcumin significantly attenuated the diabetes-induced allodynia and hyperalgesia and reduced the expression of both TNF-α and TNF-α receptor 1. Curcumin seems to relieve diabetic hyperalgesia, possibly through an inhibitory action on TNF-α and TNF-α receptor 1.
PMCID: PMC3590595  PMID: 23471081
diabetic neuropathic pain; hyperalgesia; curcumin; tumour necrosis factor α; tumour necrosis factor α receptor 1.
13.  Systematic evaluation of factors influencing ChIP-seq fidelity 
Nature methods  2012;9(6):609-614.
We performed a systematic evaluation of how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation (ChIP) followed by sequencing (ChIP-seq) experiments. Using Drosophila S2 cells, we generated ChIP-seq datasets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected and had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP library complexity at high coverage. The removal of reads originating at the same base reduced false positives while having little effect on detection sensitivity. Even at a depth of ~1 read/bp coverage of mappable genome, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle datasets with deep coverage.
PMCID: PMC3477507  PMID: 22522655
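The duplicate-removal step mentioned in the abstract above ("removal of reads originating at the same base") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: reads are represented as hypothetical `(chrom, start, strand)` tuples, and one read is kept per distinct start position and strand. The paper's actual filtering rule may differ.

```python
def drop_same_start_reads(reads):
    """Keep the first read seen at each (chrom, start, strand).

    reads: iterable of (chrom, start, strand) tuples (hypothetical layout).
    Returns a list in the original order with later duplicates removed.
    """
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept


reads = [
    ("chr2L", 500, "+"),
    ("chr2L", 500, "+"),  # same base and strand: treated as a duplicate
    ("chr2L", 500, "-"),  # same base, opposite strand: kept
    ("chr2L", 742, "+"),
]
print(drop_same_start_reads(reads))
```

Keying on strand as well as position is a design choice of this sketch; a stricter filter could ignore strand, and a looser one could allow a small number of reads per position.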
14.  Direct Sequencing and Characterization of a Clinical Isolate of Epstein-Barr Virus from Nasopharyngeal Carcinoma Tissue by Using Next-Generation Sequencing Technology 
Journal of Virology  2011;85(21):11291-11299.
Epstein-Barr virus (EBV)-encoded molecules have been detected in the tumor tissues of several cancers, including nasopharyngeal carcinoma (NPC), suggesting that EBV plays an important role in tumorigenesis. However, the nature of EBV with respect to genome width in vivo and whether EBV undergoes clonal expansion in the tumor tissues are still poorly understood. In this study, next-generation sequencing (NGS) was used to sequence DNA extracted directly from the tumor tissue of a patient with NPC. Apart from the human sequences, a clinically isolated EBV genome 164.7 kb in size was successfully assembled and named GD2 (GenBank accession number HQ020558). Sequence and phylogenetic analyses showed that GD2 was closely related to GD1, a previously assembled variant derived from a patient with NPC. GD2 contains the most prevalent EBV variants reported in Cantonese patients with NPC, suggesting that it might be the prevalent strain in this population. Furthermore, GD2 could be grouped into a single subtype according to common classification criteria and contains only 6 heterozygous point mutations, suggesting the monoclonal expansion of GD2 in NPC. This study represents the first genome-wide analysis of a clinical isolate of EBV directly extracted from NPC tissue. Our study reveals that NGS allows the characterization of genome-wide variations of EBV in clinical tumors and provides evidence of monoclonal expansion of EBV in vivo. The pipeline could also be applied to the study of other pathogen-related malignancies. With additional NGS studies of NPC, it might be possible to uncover the potential causative EBV variant involved in NPC.
PMCID: PMC3194977  PMID: 21880770
15.  Cistrome: an integrative platform for transcriptional regulation studies 
Genome Biology  2011;12(8):R83.
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery. Cistrome is available at
PMCID: PMC3245621  PMID: 21859476
16.  SHP-2 Promotes the Maturation of Oligodendrocyte Precursor Cells Through Akt and ERK1/2 Signaling In Vitro 
PLoS ONE  2011;6(6):e21058.
Oligodendrocyte precursor cells (OPCs) differentiate into oligodendrocytes (OLs), which are responsible for myelination. Myelin is essential for saltatory nerve conduction in the vertebrate nervous system. However, the molecular mechanisms of maturation and myelination by oligodendrocytes remain elusive.
Methods and Findings
In the present study, we showed that maturation of oligodendrocytes was attenuated by sodium orthovanadate (a comprehensive inhibitor of tyrosine phosphatases) and PTPi IV (a specific inhibitor of SHP-2). We also found that SHP-2 was persistently expressed during the maturation process of OPCs. Down-regulation of endogenous SHP-2 led to impairment of oligodendrocyte maturation, and this effect was triiodo-L-thyronine (T3) dependent. Furthermore, over-expression of SHP-2 was shown to promote maturation of oligodendrocytes. Finally, we found that SHP-2 was involved in the activation of Akt and extracellular-regulated kinases 1 and 2 (ERK1/2) induced by T3 in oligodendrocytes.
SHP-2 promotes oligodendrocyte maturation via Akt and ERK1/2 signaling in vitro.
PMCID: PMC3118803  PMID: 21701583
17.  Alterations of tumor-related genes do not exactly match the histopathological grade in gastric adenocarcinomas 
AIM: To investigate the diverse characteristics of different pathological gradings of gastric adenocarcinoma (GA) using tumor-related genes.
METHODS: GA tissues in different pathological gradings and normal tissues were subjected to tissue arrays. Expressions of 15 major tumor-related genes were detected by RNA in situ hybridization along with 3’ terminal digoxin-labeled anti-sense single stranded oligonucleotide and locked nucleic acid modifying probe within the tissue array. The data obtained were processed by support vector machines by four different feature selection methods to discover the respective critical gene/gene subsets contributing to the GA activities of different pathological gradings.
RESULTS: In the comparison of poorly differentiated GA with normal tissues, the tumor-related gene TP53 plays a key role, although six other tumor-related genes could each independently achieve an area under the curve (AUC) of the receiver operating characteristic of more than 80%. Comparing the well differentiated GA with normal tissues, we found that 11 tumor-related genes could each independently achieve an AUC of more than 80%, but only the gene subset TP53, RB and PTEN plays a key role. Only the gene subset Bcl10, UVRAG, APC, Beclin1, NM23, PTEN and RB could distinguish between the poorly differentiated and well differentiated GA. No single gene could provide a valid distinction.
CONCLUSION: Contrary to the traditional point of view, the well differentiated cancer tissues have more alterations of important tumor-related genes than the poorly differentiated cancer tissues.
PMCID: PMC2835792  PMID: 20205286
Pathological grading; Gastric adenocarcinoma; Tumor-related gene; Support vector machine; RNA in situ hybridization
18.  Seawater-Regulated Genes for Two-Component Systems and Outer Membrane Proteins in Myxococcus 
Journal of Bacteriology  2009;191(7):2102-2111.
When salt-tolerant Myxococcus cells are moved to a seawater environment, they change their growth, morphology, and developmental behavior. Outer membrane proteins and signal transduction pathways may play important roles in this shift. Chip hybridization targeting the genes predicted to encode 226 two-component signal transduction pathways and 74 outer membrane proteins of M. xanthus DK1622 revealed that the expression of 55 corresponding genes in the salt-tolerant strain M. fulvus HW-1 was significantly modified (most were downregulated) by the presence of seawater. Sequencing revealed that these seawater-regulated genes are highly homologous in both strains, suggesting that they have similar roles in the lifestyle of Myxococcus. Seven of the genes that had been reported in M. xanthus DK1622 are involved in different cellular processes, such as fruiting body development, sporulation, or motility. The outer membrane (Om) gene Om031 had the most significant change in expression (downregulated) in response to seawater, while the two-component system (Tc) gene Tc105 had the greatest increase in expression. Their homologues MXAN3106 and MXAN4042 were knocked out in DK1622 to analyze their functions in response to changes in salinity. In addition to having increased salt tolerance, the MXAN3106 mutant showed enhanced sporulation compared to DK1622, whereas mutating gene MXAN4042 produced contrary results. The results indicated that the genes involved in the cellular processes that are significantly changed in response to salinity may also be involved in the salt tolerance of Myxococcus cells. Regulating the expression levels of these multifunctional genes may allow cells to quickly and efficiently respond to changing conditions in coastal environments.
PMCID: PMC2655515  PMID: 19151139
19.  Model-based Analysis of ChIP-Seq (MACS) 
Genome Biology  2008;9(9):R137.
MACS performs model-based analysis of ChIP-Seq data generated by short read sequencers.
We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.
PMCID: PMC2592715  PMID: 18798982
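The "dynamic Poisson distribution" mentioned in the abstract above can be illustrated with a small Python sketch. The abstract does not give the formula, so the sketch assumes a local rate taken as the largest of a genome-wide background rate and rates estimated from control tags in windows of several widths around a candidate peak. All function names, window sizes and counts here are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(control_counts, window_sizes, peak_width, genome_rate):
    """Expected tag count in a peak under the most conservative local rate.

    control_counts[i]: control tags in a window of window_sizes[i] bp
                       centred on the peak (assumed inputs).
    genome_rate:       background tags per bp across the whole genome.
    """
    candidates = [genome_rate * peak_width]
    for count, size in zip(control_counts, window_sizes):
        # Scale each window's count down to the width of the peak.
        candidates.append(count * peak_width / size)
    return max(candidates)


lam = local_lambda([10, 20], [1000, 5000], peak_width=200, genome_rate=0.001)
print(lam, poisson_tail(8, lam))
```

Taking the maximum means a region with locally elevated control signal needs more ChIP tags to reach the same significance, which is one way local biases could be absorbed into the null model.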
20.  Genome-wide in silico identification and analysis of cis natural antisense transcripts (cis-NATs) in ten species 
Nucleic Acids Research  2006;34(12):3465-3475.
We developed a fast, integrative pipeline to identify cis natural antisense transcripts (cis-NATs) at genome scale. The pipeline mapped mRNAs and ESTs in UniGene to genome sequences in GoldenPath to find overlapping transcripts and combined information from coding sequence, poly(A) signal, poly(A) tail and splicing sites to deduce transcription orientation. We identified cis-NATs in 10 eukaryotic species, including 7830 candidate sense–antisense (SA) genes in 3915 SA pairs in human. The abundance of SA genes is remarkably low in worm and does not seem to be caused by the prevalence of operons. Hundreds of SA pairs are conserved across different species, even maintaining the same overlapping patterns. The convergent SA class is prevalent in fly, worm and sea squirt, but not in human or mouse as reported previously. The percentage of SA genes among imprinted genes in human and mouse is 24–47%, a range between the two previous reports. There is a significant shortage of SA genes on Chromosome X in human and mouse but not in fly or worm, supporting X-inactivation in mammals as a possible cause. SA genes are over-represented in the catalytic activities and basic metabolism functions. All candidate cis-NATs can be downloaded from .
PMCID: PMC1524920  PMID: 16849434
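The core idea of the pipeline in the abstract above, finding transcripts that overlap on opposite strands, can be sketched in Python. This is a minimal illustration, not the authors' pipeline: it assumes orientation has already been deduced, uses half-open `[start, end)` coordinates, and relies on a hypothetical dictionary layout and made-up transcript names.

```python
from itertools import combinations


def find_sa_pairs(transcripts):
    """Return name pairs of transcripts overlapping on opposite strands.

    transcripts: list of dicts with keys 'name', 'chrom', 'start', 'end',
                 'strand' (hypothetical layout, half-open coordinates).
    """
    pairs = []
    for a, b in combinations(transcripts, 2):
        same_chrom = a["chrom"] == b["chrom"]
        opposite = a["strand"] != b["strand"]
        # Two half-open intervals overlap when each starts before the other ends.
        overlap = a["start"] < b["end"] and b["start"] < a["end"]
        if same_chrom and opposite and overlap:
            pairs.append((a["name"], b["name"]))
    return pairs


transcripts = [
    {"name": "tA", "chrom": "chr1", "start": 100, "end": 500, "strand": "+"},
    {"name": "tB", "chrom": "chr1", "start": 450, "end": 900, "strand": "-"},
    {"name": "tC", "chrom": "chr1", "start": 480, "end": 700, "strand": "+"},
    {"name": "tD", "chrom": "chr2", "start": 450, "end": 900, "strand": "-"},
]
print(find_sa_pairs(transcripts))
```

The all-pairs comparison is quadratic and only suitable for small inputs; a genome-scale version would more likely sort by coordinate and sweep, or use an interval index.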
21.  Germline Transmission of an Embryonic Stem Cell Line Derived from BALB/c Cataract Mice 
PLoS ONE  2014;9(3):e90707.
Mouse embryonic stem (ES) cells have enabled the generation of mouse strains with defined mutation(s) in their genome for putative disease loci analysis. In the study of cataract, the complex genetic background of this disease and the lack of long-term self-renewing ES cells have hampered functional research on cataract-related genes. In this study, we aimed to establish ES cells from inherited cataract mice (BALB/CCat/Cat). Embryos of cataract mice were cultured in chemically defined N2B27 medium in the presence of two small molecules, PD0325901 and CHIR99021 (2i), and an ES cell line (named EH-BES) was successfully established. EH-BES showed long-term self-renewal in 2i medium and maintained the capacity for germline transmission. Most importantly, the resulting chimeras and offspring developed congenital cataract as well. Flow cytometry assay revealed that EH-BES cells are homogeneous in expression of Oct4 and Rex1 in 2i medium, which may account for their self-renewal ability. With long-term self-renewal ability and germline competence, the EH-BES cell line can facilitate genetic and functional research on cataract-related genes and better address mechanisms of cataract.
PMCID: PMC3942454  PMID: 24595217
22.  Improved site-specific recombinase-based method to produce selectable marker- and vector-backbone-free transgenic cells 
Scientific Reports  2014;4:4240.
PhiC31 integrase-mediated gene delivery has been extensively used in gene therapy and animal transgenesis. However, random integration events are observed in phiC31-mediated integration in different types of mammalian cells; as a result, the efficiencies of pseudo attP site integration and evaluation of site-specific integration are compromised. To improve this system, we used an attB-TK fusion gene as a negative selection marker, thereby eliminating random integration during phiC31-mediated transfection. We also excised the selection system and plasmid bacterial backbone by using two other site-specific recombinases, Cre and Dre. Thus, we generated clean transgenic bovine fetal fibroblast cells free of selectable marker and plasmid bacterial backbone. These clean cells were used as donor nuclei for somatic cell nuclear transfer (SCNT), indicating a similar developmental competence of SCNT embryos to that of non-transgenic cells. Therefore, the present gene delivery system facilitated the development of gene therapy and agricultural biotechnology.
PMCID: PMC3937794  PMID: 24577484
23.  Whole Genomic Sequence and Replication Kinetics of a New Enterovirus C96 Isolated from Guangdong, China with a Different Cell Tropism 
PLoS ONE  2014;9(1):e86877.
Enterovirus 96 (EV-C96) is a newly described serotype within the enterovirus C (EV-C) species, and its biological and pathological characteristics are largely unknown. In this study, we sequenced the whole genome of a novel EV-C96 strain that was isolated in 2011 from a patient with acute flaccid paralysis (AFP) in Guangdong province, China, and characterized the properties of its infection. Sequence analysis revealed a close relationship between the EV-C96 strains isolated from the Guangdong and Shandong provinces of China, and suggested that recombination events occurred both between these EV-C96 strains and with other EV-C viruses. Moreover, the virus replication kinetics showed that the EV-C96 Guangdong strain replicated at a high rate in RD cells and presented a different cell tropism from other strains recently isolated from Shandong. These findings give further insight into the evolutionary processes and extensive biodiversity of EV-C96.
PMCID: PMC3907579  PMID: 24497989
24.  Genome sequencing and analysis of the paclitaxel-producing endophytic fungus Penicillium aurantiogriseum NRRL 62431 
BMC Genomics  2014;15:69.
Paclitaxel (Taxol™) is an important anticancer drug with a unique mode of action. The biosynthesis of paclitaxel had been considered restricted to Taxus species until it was discovered in Taxomyces andreanae, an endophytic fungus of T. brevifolia. Subsequently, paclitaxel was found in hazel (Corylus avellana L.) and in several other endophytic fungi. The distribution of paclitaxel in plants and endophytic fungi, together with the reported sequence homology of key paclitaxel biosynthesis genes between plant and fungal species, raises the question of whether the origin of this pathway in these two physically associated groups could have been facilitated by horizontal gene transfer.
The ability of the endophytic fungus of hazel Penicillium aurantiogriseum NRRL 62431 to independently synthesize paclitaxel was established by liquid chromatography-mass spectrometry and proton nuclear magnetic resonance. The genome of Penicillium aurantiogriseum NRRL 62431 was sequenced and gene candidates that may be involved in paclitaxel biosynthesis were identified by comparison with the 13 known paclitaxel biosynthetic genes in Taxus. We found that paclitaxel biosynthetic gene candidates in P. aurantiogriseum NRRL 62431 have evolved independently and that horizontal gene transfer between this endophytic fungus and its plant host is unlikely.
Our findings shed new light on how paclitaxel-producing endophytic fungi synthesize paclitaxel, and will facilitate metabolic engineering for the industrial production of paclitaxel from fungi.
PMCID: PMC3925984  PMID: 24460898
Penicillium aurantiogriseum NRRL 62431; Paclitaxel; Taxol™; Endophytic fungi; Genome sequence; Horizontal gene transfer
25.  Genome-wide analysis of DNA methylation in bovine placentas 
BMC Genomics  2014;15:12.
DNA methylation is an important epigenetic modification that is essential for epigenetic gene regulation in development and disease. To date, the genome-wide DNA methylation maps of many organisms have been reported, but the methylation pattern of cattle remains unknown.
We generated a genome-wide DNA methylation map of placental tissues using methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq). In cattle, the methylation levels in the gene body are relatively high, whereas the promoter remains hypomethylated. We obtained thousands of highly methylated regions (HMRs), methylated CpG islands, and methylated genes from bovine placenta. DNA methylation levels around the transcription start sites of genes are negatively correlated with the gene expression level. However, the relationship between gene-body DNA methylation and gene expression is non-monotonic. Moderately expressed genes generally have the highest levels of gene-body DNA methylation, whereas highly and lowly expressed genes, as well as silent genes, show moderate DNA methylation levels. Genes with the highest expression show the lowest DNA methylation levels.
We have generated the genome-wide mapping of DNA methylation in cattle for the first time, and our results can be used for future studies on epigenetic gene regulation in cattle. This study contributes to the knowledge on epigenetics in cattle.
PMCID: PMC3893433  PMID: 24397284
