
Results 1-12 (12)

author:("Su, zhizi")
1.  The Evolutionary Panorama of Organ-Specifically Expressed or Repressed Orthologous Genes in Nine Vertebrate Species 
PLoS ONE  2015;10(2):e0116872.
RNA sequencing (RNA-Seq) technology provides the detailed transcriptomic information for a biological sample. Using the RNA-Seq data of six organs from nine vertebrate species, we identified a number of organ-specifically expressed or repressed orthologous genes whose expression patterns are mostly conserved across nine species. Our analyses show the following results: (i) About 80% of these genes have a chordate or more ancient origin and more than half of them are the legacy of one or multiple rounds of large-scale gene duplication events. (ii) Their evolutionary rates are shaped by the organ in which they are expressed or repressed, e.g. the genes specially expressed in testis and liver generally evolve more than twice as fast as the ones specially expressed in brain and cerebellum. The organ-specific transcription factors were discriminated from these genes. The ChIP-seq data from the ENCODE project also revealed the transcription-related factors that might be involved in regulating human organ-specifically expressed or repressed genes. Some of them are shared by all six human organs. The comparison of ENCODE data with mouse/chicken ChIP-seq data proposes that organ-specifically expressed or repressed orthologous genes are regulated in various combinatorial fashions in different species, although their expression features are conserved among these species. We found that the duplication events in some gene families might help explain the quick organ/tissue divergence in vertebrate lineage. The phylogenetic analysis of testis-specifically expressed genes suggests that some of them are prone to develop new functions for other organs/tissues.
PMCID: PMC4332667  PMID: 25679776
2.  Advances in Computational Genomics 
BioMed Research International  2015;2015:187803.
PMCID: PMC4300036  PMID: 25629039
3.  Effect of Duplicate Genes on Mouse Genetic Robustness: An Update 
BioMed Research International  2014;2014:758672.
In contrast to S. cerevisiae and C. elegans, analyses based on the current knockout (KO) mouse phenotypes led to the conclusion that duplicate genes had almost no role in mouse genetic robustness. It has been suggested that the bias of the mouse KO database toward ancient duplicates may cause this knockout duplicate puzzle, that is, a very similar proportion of essential genes (PE) between duplicate genes and singletons. In this paper, we conducted an extensive and careful analysis of the mouse KO phenotype data and corroborated a strong effect of duplicate genes on mouse genetic robustness. Moreover, the effect of duplicate genes on mouse genetic robustness is duplication-age dependent, which holds after ruling out the potential confounding effect from coding-sequence conservation, protein-protein connectivity, functional bias, or the bias of duplicates generated by whole genome duplication (WGD). Our findings suggest that two factors, the sampling bias toward ancient duplicates and very ancient duplicates with a proportion of essential genes higher than that of singletons, have caused the mouse knockout duplicate puzzle; meanwhile, the effect of genetic buffering may be correlated with sequence conservation as well as protein-protein interactivity.
PMCID: PMC4119742  PMID: 25110693
4.  Phylogenomic Distance Method for Analyzing Transcriptome Evolution Based on RNA-seq Data 
Genome Biology and Evolution  2013;5(9):1746-1753.
Thanks to microarray technology, our understanding of transcriptome evolution at the genome level has advanced considerably in the past decade. Yet further investigation was challenged by several technical limitations of this technology. Recent innovation in next-generation sequencing, particularly the invention of RNA-seq technology, has shed light on resolving this problem. Though a number of statistical and computational methods have been developed to analyze RNA-seq data, an analytical framework specifically designed for evolutionary genomics is still lacking. In this article we develop a new method for estimating the genome expression distance from RNA-seq data, which has explicit interpretations under the model of gene expression evolution. Moreover, this distance measure takes data overdispersion, gene length variation, and sequencing depth variation into account so that it can be applied to multiple genomes from different species. Using mammalian RNA-seq data as an example, we demonstrated that this expression distance is useful in phylogenomic analysis.
PMCID: PMC3787673  PMID: 23940099
transcriptome evolution; RNA-seq; genome expression distance
5.  Identification of Functional Mutations in GATA4 in Patients with Congenital Heart Disease 
PLoS ONE  2013;8(4):e62138.
Congenital heart disease (CHD) is one of the most prevalent developmental anomalies and the leading cause of noninfectious morbidity and mortality in newborns. Despite its prevalence and clinical significance, the etiology of CHD remains largely unknown. GATA4 is a highly conserved transcription factor that regulates a variety of physiological processes and has been extensively studied, particularly for its role in heart development. With the combination of TBX5 and MEF2C, GATA4 can reprogram postnatal fibroblasts into functional cardiomyocytes directly. In the past decade, a variety of GATA4 mutations were identified and these findings originally came from familial CHD pedigree studies. Given that familial and sporadic CHD cases are thought to share a common genetic basis, we explored the GATA4 mutations in different types of CHD. In this study, via direct sequencing of the GATA4 coding region and exon-intron boundaries in 384 sporadic Chinese CHD patients, we identified 12 heterozygous non-synonymous mutations, among which 8 mutations were only found in CHD patients when compared with 957 controls. Six of these non-synonymous mutations have not been previously reported. Subsequent functional analyses revealed that the transcriptional activity, subcellular localization and DNA binding affinity of some mutant GATA4 proteins were significantly altered. Our results expand the spectrum of GATA4 mutations linked to cardiac defects. Together with the newly reported mutations, approximately 110 non-synonymous mutations have currently been identified in GATA4. Our future analysis will explore why the evolutionarily conserved GATA4 appears to be hypermutable.
PMCID: PMC3633926  PMID: 23626780
6.  Histone modification pattern evolution after yeast gene duplication 
Gene duplication and subsequent functional divergence, especially expression divergence, have been widely considered main sources of evolutionary innovation. Many studies have shown that genetic regulatory networks evolved rapidly shortly after gene duplication, thus leading to accelerated expression divergence and diversification. However, little is known about whether epigenetic factors have mediated the evolution of expression regulation since gene duplication. In this study, we conducted detailed analyses of yeast histone modification (HM), the major epigenetic type in this organism, as well as other available functional genomics data to address this issue.
Duplicate genes, on average, share more common HM-code patterns than random singleton pairs in their promoters and open reading frames (ORF). Though HM-code divergence between duplicates in both promoter and ORF regions increases with their sequence divergence, the HM-code in the ORF region evolves more slowly than that in the promoter region, probably owing to the functional constraints imposed on protein sequences. After excluding the confounding effect of sequence divergence (or evolutionary time), we found evidence supporting the notion that in yeast, the HM-code may co-evolve with cis- and trans-regulatory factors. Moreover, we observed that deletion of some yeast HM-related enzymes increases the expression divergence between duplicate genes, yet the effect is lower than in the case of transcription factor (TF) deletion or environmental stresses.
Our analyses demonstrate that after gene duplication, the yeast histone modification profiles of duplicates diverged with evolutionary time, similar to genetic regulatory elements. Moreover, we found evidence of co-evolution between genetic and epigenetic elements since gene duplication, together contributing to the expression divergence between duplicate genes.
PMCID: PMC3495647  PMID: 22776110
Histone modification; Histone modification code divergence; Gene duplication; Expression divergence; Epigenetic divergence; cis-regulation; trans-regulation
7.  Conservation and divergence of DNA methylation in eukaryotes 
Epigenetics  2011;6(2):134-140.
DNA methylation is one of the most important heritable epigenetic modifications of the genome and is involved in the regulation of many cellular processes. Aberrant DNA methylation has been frequently reported to influence gene expression and subsequently cause various human diseases, including cancer. Recent rapid advances in next-generation sequencing technologies have enabled investigators to profile genome methylation patterns at single-base resolution. Remarkably, more than 20 eukaryotic methylomes have been generated thus far, with a majority published since November 2009. Analysis of this vast amount of data has dramatically enriched our knowledge of biological function, conservation and divergence of DNA methylation in eukaryotes. Even so, many specific functions of DNA methylation and their underlying regulatory systems still remain unknown to us. Here, we briefly introduce current approaches for DNA methylation profiling and then systematically review the features of whole genome DNA methylation patterns in eight animals, six plants and five fungi. Our systematic comparison provides new insights into the conservation and divergence of DNA methylation in eukaryotes and their regulation of gene expression. This work aims to summarize the current state of available methylome data and features informatively.
PMCID: PMC3278781  PMID: 20962593
DNA methylation; methylome; single-base resolution; CpG; gene body; broadness; deepness; promoter
8.  Functional complementation between transcriptional methylation regulation and post-transcriptional microRNA regulation in the human genome 
BMC Genomics  2011;12(Suppl 5):S15.
DNA methylation in the 5' promoter regions of genes and microRNA (miRNA) regulation at the 3' untranslated regions (UTRs) are two major epigenetic regulation mechanisms in most eukaryotes. Both DNA methylation and miRNA regulation can suppress gene expression and their corresponding protein product; thus, they play critical roles in cellular processes. Although there have been numerous investigations of gene regulation by methylation changes and miRNAs, there is no systematic genome-wide examination of their coordinated effects in any organism.
In this study, we investigated the relationship between promoter methylation at the transcription level and miRNA regulation at the post-transcription level by taking advantage of recently released human methylome data and high quality miRNA and other gene annotation data. We found that methylation level in the promoter regions and expression level were negatively correlated. Then, we showed that miRNAs tended to target the genes with a low DNA methylation level in their promoter regions. We further demonstrated that this observed pattern was not attributed to the gene expression level, expression broadness, or the number of transcription factor binding sites. Interestingly, we found miRNA target sites were significantly enriched in the genes located in differentially methylated regions or partially methylated domains. Finally, we explored the features of DNA methylation and miRNA regulation in cancer genes and found cancer genes tended to have a low methylation level and more miRNA target sites.
This is the first genome-wide investigation of the combined regulation of gene expression. Our results supported a complementary regulation between DNA methylation (transcriptional level) and miRNA function (post-transcriptional level) in the human genome. The results were helpful for our understanding of the evolutionary forces towards organisms' complexity beyond traditional sequence level investigation.
PMCID: PMC3287497  PMID: 22369656
9.  Differences in duplication age distributions between human GPCRs and their downstream genes from a network prospective 
BMC Genomics  2009;10(Suppl 1):S14.
How gene duplication has influenced the evolution of gene networks is one of the core problems in evolution. Current duplication-divergence theories generally suggested that genes on the periphery of the networks were preferentially retained after gene duplication. However, previous studies were mostly based on gene networks in invertebrate species, and they had the inherent shortcoming of not being able to provide information on how the duplication-divergence process proceeded along the time axis during major speciation events.
In this study, we constructed a model system consisting of human G protein-coupled receptors (GPCRs) and their downstream genes in the GPCR pathways. These two groups of genes offered a natural partition of genes in the peripheral and the backbone layers of the network. Analysis of the age distributions of the duplication events in human GPCRs and "downstream genes" gene families indicated that they both experienced an explosive expansion at the time of early vertebrate emergence. However, we found that only GPCR families saw a continued expansion after early vertebrates, most prominently in several small subfamilies of GPCRs involved in immune responses and sensory responses.
In general, in the human GPCR model system, we found that the position of a gene in the gene networks has significant influences on the likelihood of fixation of its duplicates. However, for a super gene family, the influence was not uniform among subfamilies. For super families, such as GPCRs, whose gene basis of expression diversity was well established at early vertebrates, continued expansions were mostly prominent in particular small subfamilies mainly involved in lineage-specific functions.
PMCID: PMC2709257  PMID: 19594873
10.  Evolution of the class C GPCR Venus flytrap modules involved positive selected functional divergence 
Class C G protein-coupled receptors (GPCRs) represent a distinct group of the GPCR family, which structurally possess a characteristically distinct extracellular domain inclusive of the Venus flytrap module (VFTM). The VFTMs of the class C GPCRs are responsible for ligand recognition and binding, and share sequence similarity with bacterial periplasmic amino acid binding proteins (PBPs). An extensive phylogenetic investigation of the VFTMs was conducted by analyzing for functional divergence and testing for positive selection for five typical groups of the class C GPCRs. The altered selective constraints were determined to identify the sites that had undergone functional divergence via positive selection. In order to structurally demonstrate the pattern changes during the evolutionary process, three-dimensional (3D) structures of the GPCR VFTMs were modelled and reconstructed from ancestral VFTMs.
Our results show that the altered selective constraints in the VFTMs of class C GPCRs are statistically significant. This implies that functional divergence played a key role in characterizing the functions of the VFTMs after gene duplication events. Meanwhile, positive selection is involved in the evolutionary process and drove the functional divergence of the VFTMs. Our results also reveal that three continuous duplication events occurred in order to shape the evolutionary topology of class C GPCRs. The five groups of the class C GPCRs have essentially different sites involved in functional divergence, which would have shaped the specific structures and functions of the VFTMs.
Taken together, our results show that functional divergence involved positive selection and is partially responsible for the evolutionary patterns of the class C GPCR VFTMs. The sites involved in functional divergence will provide more clues and candidates for further research on structural-function relationships of these modules as well as shedding light on the activation mechanism of the class C GPCRs.
PMCID: PMC2670285  PMID: 19323848
11.  Web-based resources for comparative genomics 
Human Genomics  2005;2(3):187-190.
The available web-based genome data and related resources provide great opportunities for biomedical scientists to identify functional elements in a particular genome region or to explore the evolutionary pattern of genome dynamics. Comparative genomics is an indispensable tool for achieving these goals. Because of the broad scope of comparative genomics, it is difficult to address all of its aspects in a short survey. A few currently 'hot' topics have therefore been selected and a brief review of the availability of web-based databases and software is given.
PMCID: PMC3525128  PMID: 16197736
comparative genomics; software; web-based database
12.  The Genomes of Oryza sativa: A History of Duplications 
Yu, Jun | Wang, Jun | Lin, Wei | Li, Songgang | Li, Heng | Zhou, Jun | Ni, Peixiang | Dong, Wei | Hu, Songnian | Zeng, Changqing | Zhang, Jianguo | Zhang, Yong | Li, Ruiqiang | Xu, Zuyuan | Li, Shengting | Li, Xianran | Zheng, Hongkun | Cong, Lijuan | Lin, Liang | Yin, Jianning | Geng, Jianing | Li, Guangyuan | Shi, Jianping | Liu, Juan | Lv, Hong | Li, Jun | Wang, Jing | Deng, Yajun | Ran, Longhua | Shi, Xiaoli | Wang, Xiyin | Wu, Qingfa | Li, Changfeng | Ren, Xiaoyu | Wang, Jingqiang | Wang, Xiaoling | Li, Dawei | Liu, Dongyuan | Zhang, Xiaowei | Ji, Zhendong | Zhao, Wenming | Sun, Yongqiao | Zhang, Zhenpeng | Bao, Jingyue | Han, Yujun | Dong, Lingli | Ji, Jia | Chen, Peng | Wu, Shuming | Liu, Jinsong | Xiao, Ying | Bu, Dongbo | Tan, Jianlong | Yang, Li | Ye, Chen | Zhang, Jingfen | Xu, Jingyi | Zhou, Yan | Yu, Yingpu | Zhang, Bing | Zhuang, Shulin | Wei, Haibin | Liu, Bin | Lei, Meng | Yu, Hong | Li, Yuanzhe | Xu, Hao | Wei, Shulin | He, Ximiao | Fang, Lijun | Zhang, Zengjin | Zhang, Yunze | Huang, Xiangang | Su, Zhixi | Tong, Wei | Li, Jinhong | Tong, Zongzhong | Li, Shuangli | Ye, Jia | Wang, Lishun | Fang, Lin | Lei, Tingting | Chen, Chen | Chen, Huan | Xu, Zhao | Li, Haihong | Huang, Haiyan | Zhang, Feng | Xu, Huayong | Li, Na | Zhao, Caifeng | Li, Shuting | Dong, Lijun | Huang, Yanqing | Li, Long | Xi, Yan | Qi, Qiuhui | Li, Wenjie | Zhang, Bo | Hu, Wei | Zhang, Yanling | Tian, Xiangjun | Jiao, Yongzhi | Liang, Xiaohu | Jin, Jiao | Gao, Lei | Zheng, Weimou | Hao, Bailin | Liu, Siqi | Wang, Wen | Yuan, Longping | Cao, Mengliang | McDermott, Jason | Samudrala, Ram | Wang, Jian | Wong, Gane Ka-Shu | Yang, Huanming
PLoS Biology  2005;3(2):e38.
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
Comparative genome sequencing of indica and japonica rice reveals that duplication of genes and genomic regions has played a major part in the evolution of grass genomes
PMCID: PMC546038  PMID: 15685292
