1.  PSC: protein surface classification 
Nucleic Acids Research  2012;40(Web Server issue):W435-W439.
We recently proposed to classify proteins by their functional surfaces. Using the structural attributes of functional surfaces, we inferred the pairwise relationships of proteins and constructed an expandable database of protein surface classification (PSC). As the functional surface(s) of a protein is the local region where the protein performs its function, our classification may reflect the functional relationships among proteins. Currently, PSC contains a library of 1974 surface types that include 25 857 functional surfaces identified from 24 170 bound structures. The search tool in PSC empowers users to explore related surfaces that share similar local structures and core functions. Each functional surface is characterized by structural attributes, which are geometric, physicochemical or evolutionary features. The attributes have been normalized as descriptors and integrated to produce a profile for each functional surface in PSC. In addition, binding ligands are recorded for comparisons among homologs. PSC allows users to exploit related binding surfaces to reveal the changes in functionally important residues on homologs that have led to functional divergence during evolution. The substitutions at the key residues of a spatial pattern may determine the functional evolution of a protein. In PSC (http://pocket.uchicago.edu/psc/), a pool of changes in residues on similar functional surfaces is provided.
doi:10.1093/nar/gks495
PMCID: PMC3394246  PMID: 22669905
2.  Increasing MicroRNA Target Prediction Confidence by the Relative R-squared Method 
Journal of theoretical biology  2009;259(4):793-798.
MicroRNAs (miRNAs) are short noncoding RNAs involved in post-transcriptional gene regulation via binding to mRNAs. Studies show that in a multicellular organism miRNAs downregulate a large number of target mRNAs. However, predicting the target genes of a miRNA is challenging. Microarray expression profiling has been proposed as a complementary method to increase the confidence of miRNA target prediction, but it can become computationally costly or even intractable when many miRNAs and their effects across multiple tissues are to be considered. Here, we propose a statistical method, the relative R² method, to find high-confidence targets among the set of potential targets predicted by a computational method such as TargetScanS or by microarray analysis, when expression data of both miRNAs and mRNAs are available for multiple tissues. Applying this method to existing data, we obtain many high-confidence targets in mouse.
doi:10.1016/j.jtbi.2009.05.007
PMCID: PMC2744435  PMID: 19463832
microRNA; microarray; regression model; TargetScanS
3.  Improved variance estimators for one- and two-parameter models of nucleotide substitution 
Journal of theoretical biology  2008;254(1):164-167.
The current variance estimators for Jukes and Cantor’s one-parameter model and Kimura’s two-parameter model tend to seriously underestimate the true variances when the proportion of nucleotide differences between the two sequences under study is not small. In this paper, we developed improved variance estimators, using a higher order Taylor expansion and empirical methods. The new estimators outperform the conventional estimators and provide accurate estimates of the true variances.
doi:10.1016/j.jtbi.2008.04.034
PMCID: PMC2580800  PMID: 18571203
substitution model; variance estimator; Taylor expansion; empirical formulas
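For context on entry 3, the following is a minimal sketch of the conventional Jukes-Cantor distance and its first-order (delta-method) variance, which is the kind of estimator the abstract says underestimates the true variance when the proportion of differences is large. It is not the improved estimator from the paper, whose formulas are not given in the abstract. The function names `jc69_distance` and `jc69_variance_delta` are illustrative assumptions.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_variance_delta(p, n_sites):
    """Conventional first-order variance of the JC69 distance.

    Uses Var(p) = p(1 - p) / n_sites and the derivative
    dd/dp = 1 / (1 - 4p/3); higher-order terms are ignored,
    which is why the estimate degrades as p grows.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC69 correction")
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)
```

The denominator shrinks toward zero as p approaches 0.75, so the neglected higher-order Taylor terms matter most for divergent sequence pairs.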
4.  Protein complexity, gene duplicability and gene dispensability in the yeast genome 
Gene  2006;387(1-2):109-117.
Using functional genomic and protein structural data we studied the effects of protein complexity (here defined as the number of subunit types in a protein) on gene dispensability and gene duplicability. We found that in terms of gene duplicability the major distinction in protein complexity is between hetero-complexes, each of which includes at least two different types of subunits (polypeptides), and homo-complexes, which include monomers and complexes that consist of only subunits of one polypeptide type. However, gene dispensability decreases only gradually as the number of subunit types in a protein complex increases. These observations suggest that the dosage balance hypothesis can explain gene duplicability of complex proteins well, but cannot completely explain the difference in dispensabilities between hetero-complex subunits. It is likely that knocking out a gene coding for a hetero-complex subunit would disrupt the function of the whole complex, so that the deletion effect on fitness would increase with protein complexity. We also found that multi-domain polypeptide genes are less dispensable but more duplicable than single domain polypeptide genes. Duplicate genes derived from the whole genome duplication event in yeast are more dispensable (except for ribosomal protein genes) than other duplicate genes. Further, we found that subunits of the same protein complex tend to have similar expression levels and similar effects of gene deletion on fitness. Finally, we estimated that in yeast the contribution of duplicate genes to genetic robustness against null mutation is ~ 9%, smaller than previously estimated. In yeast, protein complexity may serve as a better indicator of gene dispensability than do duplicate genes.
doi:10.1016/j.gene.2006.08.022
PMCID: PMC2707112  PMID: 17049186
Protein complex; Gene deletion; Fitness effect; Duplicate gene; Protein domain; Whole genome duplication
5.  Evolution of 5′ Untranslated Region Length and Gene Expression Reprogramming in Yeasts 
The sequences of the untranslated regions (UTRs) of mRNAs play important roles in posttranscriptional regulation, but whether a change in UTR length can significantly affect the regulation of gene expression is not clear. In this study, we examined the connection between UTR length and Expression Correlation with cytosolic ribosomal proteins (CRP) genes (ECC), which measures the level of expression similarity of a group of genes with CRP genes under various growth conditions. We used data from the aerobic fermentation yeast Saccharomyces cerevisiae and the aerobic respiration yeast Candida albicans. To reduce statistical fluctuations, we computed the ECC for the genes in a Gene Ontology (GO) functional group. We found that in both species, ECC is strongly correlated with the 5′ UTR length but not with the 3′ UTR length and that the 5′ UTR length is evolutionarily better conserved than the 3′ UTR length. Interestingly, we found 11 GO groups that have had a substantial increase in 5′ UTR length in the S. cerevisiae lineage and that the length increase was associated with a substantial decrease in ECC. Moreover, 9 of the 11 GO groups of genes are involved in mitochondrial respiration function, whose expression reprogramming has been shown to be a major factor for the evolution of aerobic fermentation. Finally, we found that an increase in 5′ UTR length may decrease the +1 nucleosome occupancy. This study provides a new angle to understand the role of 5′ UTR in gene expression regulation and evolution.
doi:10.1093/molbev/msr143
PMCID: PMC3245540  PMID: 21965341
UTR length; gene expression evolution; aerobic fermentation
6.  MicroRNA 3' end nucleotide modification patterns and arm selection preference in liver tissues 
BMC Systems Biology  2012;6(Suppl 2):S14.
Background
The expression of microRNA (miRNA) genes undergoes several maturation steps. Recent studies brought new insights into the maturation process, but also raised debates on the maturation mechanism. To understand the mechanism better, we downloaded small RNA sequence reads from NCBI SRA and quantified the expression profiles of miRNAs in normal and tumor liver tissues.
Results
From these miRNA expression profiles, we studied several issues related to miRNA biogenesis. First of all, the 3' ends of mature miRNAs usually carried modified nucleotides, generated from nucleotide addition or RNA editing. We found that adenine accounted for more than 50% of all miRNA 3' end modification events in all libraries. However, uracil dominated over adenine in several miRNA types. Moreover, the miRNA reads in the HBV-associated libraries have much lower rates of nucleotide modification. These results indicate that miRNA 3' end modifications are miRNA specific and may differ between normal and tumor tissues. Secondly, according to the hydrogen-bonding theory, the expression ratio of 5p arm to 3p arm miRNAs, derived from the same pre-miRNA, should be constant over tissues. However, a comparison of the expression profiles of the 5p arm and 3p arm miRNAs showed that one arm is preferred in the normal liver tissue whereas the other is preferred in the tumor liver tissue. In other words, different liver tissues have their own preferences on selecting either arm to be mature miRNAs.
Conclusions
The results suggest that besides the traditional miRNA biogenesis theory, another mechanism may also participate in the miRNA biogenesis pathways.
doi:10.1186/1752-0509-6-S2-S14
PMCID: PMC3521178  PMID: 23282006
7.  The Relationships Among MicroRNA Regulation, Intrinsically Disordered Regions, and Other Indicators of Protein Evolutionary Rate 
Molecular Biology and Evolution  2011;28(9):2513-2520.
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein–protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
doi:10.1093/molbev/msr068
PMCID: PMC3163433  PMID: 21398349
protein evolution; disordered proteins; microRNA regulation; protein–protein interaction; phosphorylation
8.  Transcriptomes of Mouse Olfactory Epithelium Reveal Sexual Differences in Odorant Detection 
Genome Biology and Evolution  2012;4(5):703-712.
To sense numerous odorants and chemicals, animals have evolved a large number of olfactory receptor genes (Olfrs) in their genome. In particular, the house mouse has ∼1,100 genes in the Olfr gene family. This makes the mouse a good model organism to study Olfr genes and olfaction-related genes. To date, whether male and female mice possess the same ability in detecting environmental odorants is still unknown. Using next-generation sequencing technology (paired-end mRNA-seq), we detected 1,088 expressed Olfr genes in both male and female olfactory epithelium. We found that not only Olfr genes but also odorant-binding protein (Obp) genes have evolved rapidly in the mouse lineage. Interestingly, Olfr genes tend to be expressed at a higher level in males than in females, whereas the Obp genes clustered on the X chromosome show the opposite trend. These observations may imply a more efficient odorant-transporting system in females and a more active Olfr gene expression system in males. In addition, we detected the expression of two genes encoding major urinary proteins, which have been proposed to bind and transport pheromones or act as pheromones in mouse urine. This observation suggests a role of the main olfactory system (MOS) in pheromone detection, contrary to the view that only the accessory olfactory system (AOS) is involved in pheromone detection. This study suggests sexual differences in detecting environmental odorants in the MOS and demonstrates that mRNA-seq provides a powerful tool for detecting genes with low expression levels and with high sequence similarities.
doi:10.1093/gbe/evs039
PMCID: PMC3381674  PMID: 22511034
mRNA-seq; olfactory epithelium; olfactory receptor; odorant-binding protein; major urinary protein; sexual differentiation
9.  The Evolution of Aerobic Fermentation in Schizosaccharomyces pombe Was Associated with Regulatory Reprogramming but not Nucleosome Reorganization 
Molecular Biology and Evolution  2010;28(4):1407-1413.
Aerobic fermentation has evolved independently in two yeast lineages, the Saccharomyces cerevisiae and the Schizosaccharomyces pombe lineages. In the S. cerevisiae lineage, the evolution of aerobic fermentation was shown to be associated with transcriptional reprogramming of the genes involved in respiration and was recently suggested to be linked to changes in nucleosome occupancy pattern in the promoter regions of respiration-related genes. In contrast, little is known about the genetic basis for the evolution of aerobic fermentation in the Sch. pombe lineage. In particular, it is not known whether respiration-related genes in Sch. pombe have undergone a transcriptional reprogramming or changes in nucleosome occupancy pattern in their promoter regions. In this study, we compared genome-wide gene expression profiles of Sch. pombe with those of S. cerevisiae and the aerobic respiration yeast Candida albicans. We found that the expression profile of respiration-related genes in Sch. pombe is similar to that of S. cerevisiae, but different from that of C. albicans, suggesting that their transcriptional regulation has been reprogrammed during the evolution of aerobic fermentation. However, we found no significant nucleosome organization change in the promoters of respiration-related genes in Sch. pombe.
doi:10.1093/molbev/msq324
PMCID: PMC3058771  PMID: 21127171
aerobic fermentation; nucleosome organization; gene expression; Schizosaccharomyces pombe
10.  Contribution of Transcription Factor Binding Site Motif Variants to Condition-Specific Gene Expression Patterns in Budding Yeast 
PLoS ONE  2012;7(2):e32274.
It is now experimentally well known that variant sequences of a cis transcription factor binding site motif can contribute to differential regulation of genes. We characterize the relationship between motif variants and gene expression by analyzing expression microarray data and binding site predictions. To accomplish this, we statistically detect motif variants with effects that differ among environments. Such environmental specificity may be due to either affinity differences between variants or, more likely, differential interactions of TFs bound to these variants with cofactors, and with differential presence of cofactors across environments. We examine conservation of functional variants across four Saccharomyces species, and find that about a third of transcription factors have target genes that are differentially expressed in a condition-specific manner that is correlated with the nucleotide at variant motif positions. We find good correspondence between our results and some cases in the experimental literature (Reb1, Sum1, Mcm1, and Rap1). These results and growing consensus in the literature indicate that motif variants may often be functionally distinct, that this may be observed in genomic data, and that variants play an important role in condition-specific gene regulation.
doi:10.1371/journal.pone.0032274
PMCID: PMC3285675  PMID: 22384202
11.  Expansion of Hexose Transporter Genes Was Associated with the Evolution of Aerobic Fermentation in Yeasts 
Molecular Biology and Evolution  2010;28(1):131-142.
The genetic basis of organisms’ adaptation to different environments is a central issue of molecular evolution. The budding yeast Saccharomyces cerevisiae and its relatives predominantly ferment glucose into ethanol even in the presence of oxygen. This was suggested to be an adaptation to glucose-rich habitats, but the underlying genetic basis of the evolution of aerobic fermentation remains unclear. In S. cerevisiae, the first step of glucose metabolism is transporting glucose across the plasma membrane, which is carried out by hexose transporter proteins. Although several studies have recognized that the rate of glucose uptake can affect how glucose is metabolized, the role of HXT genes in the evolution of aerobic fermentation has not been fully explored. In this study, we identified all members of the HXT gene family in 23 fully sequenced fungal genomes, reconstructed their evolutionary history to pinpoint gene gain and loss events, and evaluated their adaptive significance in the evolution of aerobic fermentation. We found that the HXT genes have been extensively amplified in the two fungal lineages that have independently evolved aerobic fermentation. In contrast, reduction of the number of HXT genes has occurred in aerobic respiratory species. Our study reveals a strong positive correlation between the copy number of HXT genes and the strength of aerobic fermentation, suggesting that HXT gene expansion has facilitated the evolution of aerobic fermentation.
doi:10.1093/molbev/msq184
PMCID: PMC3002240  PMID: 20660490
aerobic fermentation; HXT; glucose metabolism; glucose transport; adaptive evolution
12.  Revealing the Anti-Tumor Effect of Artificial miRNA p-27-5p on Human Breast Carcinoma Cell Line T-47D 
MicroRNAs (miRNAs) cause mRNA degradation or translation suppression of their target genes. Previous studies have found direct involvement of miRNAs in cancer initiation and progression. Artificial miRNAs, designed to target single or multiple genes of interest, provide a new therapeutic strategy for cancer. This study investigates the anti-tumor effect of a novel artificial miRNA, miR P-27-5p, on breast cancer. In this study, we reveal that miR P-27-5p downregulates the differential gene expressions associated with the protein modification process and regulation of cell cycle in T-47D cells. Introduction of this novel artificial miRNA, miR P-27-5p, into breast cell lines inhibits cell proliferation and induces the first “gap” phase (G1) cell cycle arrest in cancer cell lines but does not affect normal breast cells. We further show that miR P-27-5p targets the 3′-untranslated mRNA region (3′-UTR) of cyclin-dependent kinase 4 (CDK4) and reduces both the mRNA and protein level of CDK4, which in turn, interferes with phosphorylation of the retinoblastoma protein (RB1). Overall, our data suggest that the effects of miR P-27-5p on cell proliferation and G1 cell cycle arrest are through the downregulation of CDK4 and the suppression of RB1 phosphorylation. This study opens avenues for future therapies targeting breast cancer.
doi:10.3390/ijms13056352
PMCID: PMC3382822  PMID: 22754369
miR P-27-5p; exon array; cyclin-dependent kinase 4; cell cycle; breast cancer; retinoblastoma protein
13.  Phosphorylated and Nonphosphorylated Serine and Threonine Residues Evolve at Different Rates in Mammals 
Molecular Biology and Evolution  2010;27(11):2548-2554.
Protein phosphorylation plays an important role in the regulation of protein function. Phosphorylated residues are generally assumed to be subject to functional constraint, but it has recently been suggested from a comparison of distantly related vertebrate species that most phosphorylated residues evolve at the rates consistent with the surrounding regions. To resolve the controversy, we infer the ancestral phosphoproteome of human and mouse to compare the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues. This approach enables accurate estimation of evolutionary rates as it does not assume deep conservation of phosphorylated residues. We show that phosphorylated S/T residues tend to evolve more slowly than nonphosphorylated S/T residues not only in disordered but also in ordered protein regions, indicating evolutionary conservation of phosphorylated S/T residues in mammals. Thus, phosphorylated S/T residues tend to be subject to stronger functional constraint than nonphosphorylated residues regardless of the protein regions in which they reside. In contrast, phosphorylated Y residues evolve at similar rates as nonphosphorylated ones. We also find that the human lineage has gained more phosphorylated T residues and lost fewer phosphorylated Y residues than the mouse lineage. The cause of the gain/loss imbalance remains a mystery but should be worth exploring.
doi:10.1093/molbev/msq142
PMCID: PMC2955733  PMID: 20534707
phosphorylated residue; protein disordered region; evolutionary rate; functional constraint
14.  The genetic basis of evolutionary change in gene expression levels 
The regulation of gene expression is an important determinant of organismal phenotype and evolution. However, the widespread recognition of this fact occurred long after the synthesis of evolution and genetics. Here, we give a brief sketch of thoughts regarding gene regulation in the history of evolution and genetics. We then review the development of genome-wide studies of gene regulatory variation in the context of the location and mode of action of the causative genetic changes. In particular, we review mapping of the genetic basis of expression variation through expression quantitative trait locus studies and measuring the cis/trans component of expression variation in allele-specific expression studies. We conclude by proposing a systematic integration of ideas that combines global mapping studies, cis/trans tests and modern population genetics methodologies, in order to directly estimate the forces acting on regulatory variation within and between species.
doi:10.1098/rstb.2010.0005
PMCID: PMC2935095  PMID: 20643748
gene regulation; expression evolution; allele-specific expression; expression QTL; cis-regulation; trans-regulation
15.  A simple method using Pyrosequencing™ to identify de novo SNPs in pooled DNA samples 
Nucleic Acids Research  2010;39(5):e28.
A practical way to reduce the cost of surveying single-nucleotide polymorphisms (SNPs) in a large number of individuals is to measure the allele frequencies in pooled DNA samples. Pyrosequencing™ has been frequently used for this application because signals generated by this approach are proportional to the amount of DNA template. The Pyrosequencing™ pyrogram is determined by the dispensing order of dNTPs, which is usually designed based on the known SNPs to avoid asynchronous extensions of heterozygous sequences. Therefore, utilizing the pyrogram signals to identify de novo SNPs in DNA pools has never been undertaken. In this study, we developed an algorithm to address this issue. With the sequence and pyrogram of the wild-type allele known in advance, we could use the pyrogram obtained from the pooled DNA sample to predict the sequence of the unknown mutant allele (de novo SNP) and estimate its allele frequency. Both computational simulation and experimental Pyrosequencing™ test results suggested that our method performs well. The web interface of our method is available at http://life.nctu.edu.tw/∼yslin/PSM/.
doi:10.1093/nar/gkq1249
PMCID: PMC3061071  PMID: 21131285
16.  Roles of Trans and Cis Variation in Yeast Intraspecies Evolution of Gene Expression 
Molecular Biology and Evolution  2009;26(11):2533-2538.
Both cis and trans mutations contribute to gene expression divergence within and between species. We used Saccharomyces cerevisiae as a model organism to estimate the relative contributions of cis and trans variations to the expression divergence between a laboratory (BY) and a wild (RM) strain of yeast. We examined whether genes regulated by a single transcription factor (TF; single input module, SIM genes) or genes regulated by multiple TFs (multiple input module, MIM genes) are more susceptible to trans variation. Because a SIM gene is regulated by a single immediate upstream TF, the chance for a change to occur in its trans-acting factors would, on average, be smaller than that for a MIM gene. We chose 232 genes that exhibited expression divergence between BY and RM to test this hypothesis. We examined the expression patterns of these genes in a BY–RM coculture system and in a BY–RM diploid hybrid. We found that trans variation is far more important than cis variation for expression divergence between the two strains. However, because in 75% of the genes studied, cis variation has significantly contributed to expression divergence, cis change also plays a significant role in intraspecific expression evolution. Interestingly, we found that the proportion of genes with diverged expression between BY and RM is larger for MIM genes than for SIM genes; in fact, the proportion tends to increase with the number of transcription factors that regulate the gene. Moreover, MIM genes are, on average, subject to stronger trans effects than SIM genes, though the difference between the two types of genes is not conspicuous.
doi:10.1093/molbev/msp171
PMCID: PMC2767097  PMID: 19648464
cis-regulation; trans-regulation; yeast; expression evolution
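For context on entry 16, the following is a minimal sketch of a commonly used way to split expression divergence into cis and trans components from parental and hybrid measurements. It is an assumption for illustration, not necessarily the procedure used in the paper (which also involved a BY–RM coculture system). In a diploid hybrid both alleles share the same trans-acting environment, so the allelic log ratio is attributed to cis effects, and the remainder of the parental difference is attributed to trans effects. The function name and arguments are hypothetical.

```python
import math


def cis_trans_components(parent_by, parent_rm, hybrid_by, hybrid_rm):
    """Split log2 expression divergence into (cis, trans) components.

    parent_by, parent_rm: expression of the gene in each parental strain.
    hybrid_by, hybrid_rm: allele-specific expression in the diploid hybrid.
    All four values are assumed to be positive, normalized measurements.
    """
    for value in (parent_by, parent_rm, hybrid_by, hybrid_rm):
        if value <= 0:
            raise ValueError("expression values must be positive")
    total = math.log2(parent_by / parent_rm)  # total divergence between strains
    cis = math.log2(hybrid_by / hybrid_rm)    # allelic imbalance in shared trans background
    trans = total - cis                       # divergence not explained by cis
    return cis, trans


# Example: a 4-fold parental difference with a 2-fold allelic imbalance in the hybrid.
cis, trans = cis_trans_components(8.0, 2.0, 4.0, 2.0)
print(cis, trans)
```

This additive decomposition on the log scale is a simplification; it ignores measurement noise and any cis-by-trans interaction.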
17.  The spatial distribution of cis regulatory elements in yeast promoters and its implications for transcriptional regulation 
BMC Genomics  2010;11:581.
Background
How the transcription factor binding sites (TFBSs) are distributed in the promoter region has implications for gene regulation. Previous studies used the translation start codon as the reference point to infer the TFBS distribution. However, it is biologically more relevant to use the transcription start site (TSS) as the reference point. In this study, we reexamined the spatial distribution of TFBSs, investigated various promoter features that may affect the distribution, and studied the effect of TFBS distribution on transcriptional regulation.
Results
We found a sharp peak for the distribution of TFBSs at ~115 bp upstream of the TSS, but no clear peak when the translation start codon was used as the reference point. Our analysis of sequence variation data among 63 yeast strains revealed very low deletion polymorphisms in the region between the distribution peak and the TSS, suggesting that the distances between TFBSs and the TSS have been selectively constrained in evolution. As in previous studies, we found that the nucleosome occupancy and the presence/absence of TATA-box in the promoter region affect the TFBS distribution pattern. In addition, we found that there exists a correlation between the 5'UTR length and the TFBS distribution pattern and we showed that the TFBS distribution pattern affects gene transcription level and plasticity.
Conclusions
The spatial distribution of TFBSs obtained using the TSS as the reference point shows a much sharper peak than does the distribution obtained using the translation start codon as the reference point. The TFBS distribution pattern is affected by nucleosome occupancy and presence of TATA-box and it affects the transcription level and transcription plasticity of the gene.
doi:10.1186/1471-2164-11-581
PMCID: PMC3091728  PMID: 20958978
18.  Functional compensation by duplicated genes in mouse 
Trends in genetics : TIG  2009;25(10):441-442.
doi:10.1016/j.tig.2009.08.001
PMCID: PMC2764834  PMID: 19783063
19.  Functional Compensation of Primary and Secondary Metabolites by Duplicate Genes in Arabidopsis thaliana 
Molecular Biology and Evolution  2010;28(1):377-382.
It is well known that knocking out a gene in an organism often causes no phenotypic effect. One possible explanation is the existence of duplicate genes; that is, the effect of knocking out a gene is compensated by a duplicate copy. Another explanation is the existence of alternative pathways. In terms of metabolic products, the relative roles of the two mechanisms have been extensively studied in yeast but not in any multicellular organism. Here, to address the functional compensation of metabolic products by duplicate genes, we quantified 35 metabolic products from 1,976 genes in knockout mutants of Arabidopsis thaliana by high-throughput liquid chromatography-mass spectrometry (LC-MS) analysis. We found that knocking out either a singleton gene or a duplicate gene with distant paralogs in the genome tends to induce stronger metabolic effects than knocking out a duplicate gene with a close paralog in the genome, indicating that only duplicate genes with close paralogs play a significant role in functional compensation for metabolic products in A. thaliana. To extend the analysis, we examined metabolic products with either high or low connectivity in a metabolic network. We found that the compensatory role of duplicate genes is less important when the metabolite has a high connectivity, indicating that functional compensation by alternative pathways is common in the case of high connectivity. In conclusion, recently duplicated genes play an important role in the compensation of metabolic products only when the number of alternative pathways is small.
doi:10.1093/molbev/msq204
PMCID: PMC3002239  PMID: 20736450
gene duplication; metabolic network; Arabidopsis thaliana; functional compensation; alternative pathway
20.  Lowly Expressed Human MicroRNA Genes Evolve Rapidly 
Molecular Biology and Evolution  2009;26(6):1195-1198.
To study the evolution of human microRNAs (miRNAs), we examined nucleotide variation in humans, sequence divergence between species, and genomic clustering patterns for miRNAs with different expression levels. We found that expression level is a major indicator of the rate of evolution and that ∼30% of currently annotated human miRNA genes are almost free of selective pressure.
doi:10.1093/molbev/msp053
PMCID: PMC2727378  PMID: 19299536
microRNA expression; sequence divergence; spatial clustering; selective pressure
22.  Gene Family Size Conservation Is a Good Indicator of Evolutionary Rates 
Molecular Biology and Evolution  2010;27(8):1750-1758.
The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human–chimpanzee–macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
doi:10.1093/molbev/msq055
PMCID: PMC2908708  PMID: 20194423
gene duplication; gene essentiality; gene family size conservation; singleton; primate evolution
23.  Parallel Evolution between Aromatase and Androgen Receptor in the Animal Kingdom 
Molecular Biology and Evolution  2008;26(1):123-129.
There are now many known cases of orthologous or unrelated proteins in different species that have undergone parallel evolution to satisfy a similar function. However, there are no reported cases of parallel evolution for proteins that bind a common ligand but have different functions. We focused on two proteins that have different functions in steroid hormone biosynthesis and action but bind a common ligand, androgen. The first protein, androgen receptor (AR), is a nuclear hormone receptor and the second one, aromatase (cytochrome P450 19 [CYP19]), converts androgen to estrogen. We hypothesized that binding of the androgen ligand has exerted common selective pressure on both AR and CYP19, resulting in a signature of parallel evolution between these two proteins, though they perform different functions. Consistent with this hypothesis, we found that rates of amino acid change in AR and CYP19 are strongly correlated across the metazoan phylogeny, whereas no significant correlation was found in the control set of proteins. Moreover, we inferred that genomic toolkits required for steroid biosynthesis and action were present in a basal metazoan, cnidarians. The close similarities between vertebrate and sea anemone AR and CYP19 suggest a very ancient origin of their endocrine functions at the base of metazoan evolution. Finally, we found evidence supporting the hypothesis that the androgen-to-estrogen ratio determines the gonadal sex in all metazoans.
doi:10.1093/molbev/msn233
PMCID: PMC2721557  PMID: 18936441
androgen receptor; aromatase; steroid hormone; parallel evolution
24.  fPOP: footprinting functional pockets of proteins by comparative spatial patterns 
Nucleic Acids Research  2009;38(Database issue):D288-D295.
fPOP (footprinting Pockets Of Proteins, http://pocket.uchicago.edu/fpop/) is a relational database of the protein functional surfaces identified by analyzing the shapes of binding sites in ∼42 700 structures, including both holo and apo forms. We previously used a purely geometric method to extract the spatial patterns of functional surfaces (split pockets) in ∼19 000 bound structures and constructed a database, SplitPocket (http://pocket.uchicago.edu/). These functional surfaces are now used as spatial templates to predict the binding surfaces of unbound structures. To conduct a shape comparison, we use the Smith–Waterman algorithm to footprint an unbound pocket fragment with those of the functional surfaces in SplitPocket. The pairwise alignment of the unbound and bound pocket fragments is used to evaluate the local structural similarity via geometric matching. The final results of our large-scale computation, including ∼90 000 identified or predicted functional surfaces, are stored in fPOP. This database provides an easily accessible resource for studying functional surfaces, assessing conformational changes between bound and unbound forms and analyzing functional divergence. Moreover, it may facilitate the exploration of the physicochemical textures of molecules and the inference of protein function. Finally, our approach provides a framework for classification of proteins into families on the basis of their functional surfaces.
doi:10.1093/nar/gkp900
PMCID: PMC2808891  PMID: 19880384
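The fPOP abstract above says that unbound pocket fragments are compared with SplitPocket templates using the Smith–Waterman algorithm. As a rough illustration of that local-alignment step only, here is a minimal sketch. It assumes that a pocket fragment can be written as a string of one-letter residue codes, and it uses hypothetical scores (match +2, mismatch -1, linear gap -1). The abstract specifies neither the encoding nor the scoring scheme, and this is not the fPOP implementation, which also evaluates geometric matching.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of two residue strings.

    Scoring values are illustrative assumptions, not taken from fPOP.
    Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamp at zero so an alignment can restart anywhere (local, not global)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical fragments: an "unbound" pocket and a "bound" template
    print(smith_waterman("XXHDSGYY", "QHDSGW"))
```

Under these assumed scores, the two hypothetical fragments share the local stretch HDSG, which the function reports with its alignment score. A realistic comparison would replace the identity scoring with a residue substitution matrix and add the geometric check that the abstract mentions.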
25.  Evolutionary Persistence of Functional Compensation by Duplicate Genes in Arabidopsis 
Knocking out a gene from a genome often causes no phenotypic effect. This phenomenon has been explained in part by the existence of duplicate genes. However, it was found that in mouse knockout data duplicate genes are as essential as singleton genes. Here, we study whether it is also true for the knockout data in Arabidopsis. From the knockout data in Arabidopsis thaliana obtained in our study and in the literature, we find that duplicate genes show a significantly lower proportion of knockout effects than singleton genes. Because the persistence of duplicate genes in evolution tends to be dependent on their phenotypic effect, we compared the ages of duplicate genes whose knockout mutants showed less severe phenotypic effects with those with more severe effects. Interestingly, the latter group of genes tends to be more anciently duplicated than the former group of genes. Moreover, using multiple-gene knockout data, we find that functional compensation by duplicate genes for a more severe phenotypic effect tends to be preserved by natural selection for a longer time than that for a less severe effect. Taken together, we conclude that duplicate genes contribute to genetic robustness mainly by preserving compensation for severe phenotypic effects in A. thaliana.
doi:10.1093/gbe/evp043
PMCID: PMC2817435  PMID: 20333209
duplicate; Arabidopsis thaliana; phenotypic effect; functional compensation; selection pressure and genetic robustness
