We recently proposed to classify proteins by their functional surfaces. Using the structural attributes of functional surfaces, we inferred the pairwise relationships of proteins and constructed an expandable database of protein surface classification (PSC). As the functional surface(s) of a protein is the local region where the protein performs its function, our classification may reflect the functional relationships among proteins. Currently, PSC contains a library of 1974 surface types that include 25 857 functional surfaces identified from 24 170 bound structures. The search tool in PSC empowers users to explore related surfaces that share similar local structures and core functions. Each functional surface is characterized by structural attributes, which are geometric, physicochemical or evolutionary features. The attributes have been normalized as descriptors and integrated to produce a profile for each functional surface in PSC. In addition, binding ligands are recorded for comparisons among homologs. PSC allows users to exploit related binding surfaces to reveal the changes in functionally important residues on homologs that have led to functional divergence during evolution. The substitutions at the key residues of a spatial pattern may determine the functional evolution of a protein. In PSC (http://pocket.uchicago.edu/psc/), a pool of changes in residues on similar functional surfaces is provided.
MicroRNAs (miRNAs) are short noncoding RNAs involved in post-transcriptional gene regulation via binding to mRNAs. Studies show that in a multicellular organism miRNAs downregulate a large number of target mRNAs. However, predicting the target genes of a miRNA is challenging. Microarray expression profiling has been proposed as a complementary method to increase the confidence of miRNA target prediction, but it can become computationally costly or even intractable when many miRNAs and their effects across multiple tissues are to be considered. Here, we propose a statistical method, the relative R² method, to find high-confidence targets among the set of potential targets predicted by a computational method such as TargetScanS or by microarray analysis, when expression data of both miRNAs and mRNAs are available for multiple tissues. Applying this method to existing data, we obtain many high-confidence targets in mouse.
microRNA; microarray; regression model; TargetScanS
The current variance estimators for Jukes and Cantor’s one-parameter model and Kimura’s two-parameter model tend to seriously underestimate the true variances when the proportion of nucleotide differences between the two sequences under study is not small. In this paper, we developed improved variance estimators, using a higher order Taylor expansion and empirical methods. The new estimators outperform the conventional estimators and provide accurate estimates of the true variances.
substitution model; variance estimator; Taylor expansion; empirical formulas
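For context, the conventional estimator referred to above for the Jukes and Cantor model can be sketched as follows. This is the standard first-order (delta-method) formula, not the improved estimator developed in the paper; the function names are illustrative only.

```python
import math

def jc_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def jc_variance_first_order(p, n):
    """Conventional first-order variance of the JC distance.

    p is the observed proportion of differences, n the number of sites
    compared. This comes from a first-order Taylor expansion around p,
    which is the approximation the abstract describes as unreliable
    when p is not small.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return p * (1.0 - p) / (n * (1.0 - 4.0 * p / 3.0) ** 2)

# Example: 30% observed differences over 100 aligned sites.
d = jc_distance(0.3)
v = jc_variance_first_order(0.3, 100)
```

Because the denominator shrinks as p approaches 0.75, higher-order terms that the first-order formula drops become increasingly important for divergent sequences.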
Using functional genomic and protein structural data we studied the effects of protein complexity (here defined as the number of subunit types in a protein) on gene dispensability and gene duplicability. We found that in terms of gene duplicability the major distinction in protein complexity is between hetero-complexes, each of which includes at least two different types of subunits (polypeptides), and homo-complexes, which include monomers and complexes that consist of only subunits of one polypeptide type. However, gene dispensability decreases only gradually as the number of subunit types in a protein complex increases. These observations suggest that the dosage balance hypothesis can explain gene duplicability of complex proteins well, but cannot completely explain the difference in dispensabilities between hetero-complex subunits. It is likely that knocking out a gene coding for a hetero-complex subunit would disrupt the function of the whole complex, so that the deletion effect on fitness would increase with protein complexity. We also found that multi-domain polypeptide genes are less dispensable but more duplicable than single domain polypeptide genes. Duplicate genes derived from the whole genome duplication event in yeast are more dispensable (except for ribosomal protein genes) than other duplicate genes. Further, we found that subunits of the same protein complex tend to have similar expression levels and similar effects of gene deletion on fitness. Finally, we estimated that in yeast the contribution of duplicate genes to genetic robustness against null mutation is ~9%, smaller than previously estimated. In yeast, protein complexity may serve as a better indicator of gene dispensability than do duplicate genes.
Protein complex; Gene deletion; Fitness effect; Duplicate gene; Protein domain; Whole genome duplication
Histone modification is an important mechanism of gene regulation in eukaryotes. Why many histone modifications can be stably maintained in the midst of genetic and environmental changes is a fundamental question in evolutionary biology. We obtained genome-wide profiles of three histone marks, H3 lysine 4 tri-methylation (H3K4me3), H3 lysine 4 mono-methylation (H3K4me1), and H3 lysine 27 acetylation (H3K27ac), for several cell types from human and mouse. We identified histone modifications that were stable among different cell types in human and histone modifications that were evolutionarily conserved between mouse and human in the same cell type. We found that histone modifications that were stable among cell types were also likely to be conserved between species. This trend was consistently observed in promoter, intronic, and intergenic regions for all of the histone marks tested. Importantly, the trend was observed regardless of the expression breadth of the nearby gene, indicating that slow evolution of housekeeping genes was not the major reason for the correlation. These regions showed distinct genetic and epigenetic properties, such as clustered transcription factor binding sites (TFBSs), high GC content, and CTCF binding at flanking sides. Based on our observations, we proposed that TFBS clustering in or near a histone modification plays a significant role in stabilizing and conserving the histone modification because TFBS clustering promotes TFBS conservation, which in turn promotes histone modification conservation. In summary, the results of this study support the view that in mammalian genomes a common mechanism maintains histone modifications against both genetic and environmental (cellular) changes.
histone modification; transcription factor binding site; evolution of chromatin state
Domestic chickens are excellent models for investigating the genetic basis of phenotypic diversity, as numerous phenotypic changes in physiology, morphology, and behavior in chickens have been artificially selected. Dissecting the genetic basis of these phenotypic traits requires genome-wide surveys of DNA variation. We sequenced the genomes of the Silkie and the Taiwanese native chicken L2 at ∼23- and 25-fold average coverage depth, respectively, using Illumina sequencing. The reads were mapped onto the chicken reference genome (including 5.1% Ns) to 92.32% genome coverage for the two breeds. Using a stringent filter, we identified ∼7.6 million single-nucleotide polymorphisms (SNPs) and 8,839 copy number variations (CNVs) in the mapped regions; 42% of the SNPs have not been found in other chickens before. Among the 68,906 SNPs annotated in the chicken sequence assembly, 27,852 were nonsynonymous SNPs located in 13,537 genes. We also identified hundreds of shared and divergent structural and copy number variants in intronic and intergenic regions and in coding regions in the two breeds. Functional enrichments of the identified genetic variants are discussed. Radical nsSNP-containing immunity genes were enriched in the QTL regions associated with some economic traits for both breeds. Moreover, genetic changes involved in selective sweeps were detected. From the selective sweeps identified in our two breeds, several genes associated with growth, appetite, and metabolic regulation were identified. Our study provides a framework for genetic and genomic research of domestic chickens and facilitates the use of the domestic chicken as an avian model for genomic, biomedical, and evolutionary studies.
single nucleotide polymorphism; whole genome resequencing; genetic variation; CNVs; chicken
Gene regulation change has long been recognized as an important mechanism for phenotypic evolution. We used the evolution of yeast aerobic fermentation as a model to explore how gene regulation has evolved and how this process has contributed to phenotypic evolution and adaptation. Most eukaryotes fully oxidize glucose to CO2 and H2O in mitochondria to maximize energy yield, whereas some yeasts, such as Saccharomyces cerevisiae and its relatives, predominantly ferment glucose into ethanol even in the presence of oxygen, a phenomenon known as aerobic fermentation. We examined the genome-wide gene expression levels among 12 different yeasts and found that a group of genes involved in the mitochondrial respiration process showed the largest reduction in gene expression level during the evolution of aerobic fermentation. Our analysis revealed that the downregulation of these genes was significantly associated with massive loss of binding motifs of Cbf1p in the fermentative yeasts. Our experimental assays confirmed the binding of Cbf1p to the predicted motif and the activator role of Cbf1p. In summary, our study laid a foundation to unravel the long-time mystery about the genetic basis of evolution of aerobic fermentation, providing new insights into understanding the role of cis-regulatory changes in phenotypic evolution.
yeast; fermentation; cis-regulation; CBF1; evolution; gene expression
Motivation: Histone modifications regulate chromatin structure and gene
expression. Although nucleosome formation is known to be affected by primary DNA sequence
composition, no sequence signature has been identified for histone modifications. It is
known that dense H3K4me3 nucleosome sites are accompanied by a low density of other
nucleosomes and are associated with gene activation. This observation suggests a different
sequence composition of H3K4me3 from other nucleosomes.
Approach: To understand the relationship between genome sequence and
chromatin structure, we studied DNA sequences at histone modification sites in various
human cell types. We found sequence specificity for H3K4me3, but not for other histone
modifications. Using the sequence specificities of H3 and H3K4me3 nucleosomes, we
developed a model that computes the probability of H3K4me3 occupation at each base pair
from the genome sequence context.
Results: A comparison of our predictions with experimental data suggests a
high performance of our method, revealing a strong association between H3K4me3 and
specific genomic DNA context. The high probability of H3K4me3 occupation occurs at
transcription start and termination sites, exon boundaries and binding sites of
transcription regulators involved in chromatin modification activities, including histone
acetylases and enhancer- and insulator-associated factors. Thus, the human genome sequence
contains signatures for chromatin modifications essential for gene regulation and
development. Our method may be applied to find new sequence elements functioning by
Availability: Software and supplementary data are available at
firstname.lastname@example.org or email@example.com
Supplementary data are available at Bioinformatics
The sequences of the untranslated regions (UTRs) of mRNAs play important roles in posttranscriptional regulation, but whether a change in UTR length can significantly affect the regulation of gene expression is not clear. In this study, we examined the connection between UTR length and Expression Correlation with cytosolic ribosomal protein (CRP) genes (ECC), which measures the level of expression similarity of a group of genes with CRP genes under various growth conditions. We used data from the aerobic fermentation yeast Saccharomyces cerevisiae and the aerobic respiration yeast Candida albicans. To reduce statistical fluctuations, we computed the ECC for the genes in a Gene Ontology (GO) functional group. We found that in both species, ECC is strongly correlated with the 5′ UTR length but not with the 3′ UTR length and that the 5′ UTR length is evolutionarily better conserved than the 3′ UTR length. Interestingly, we found 11 GO groups that have had a substantial increase in 5′ UTR length in the S. cerevisiae lineage and that the length increase was associated with a substantial decrease in ECC. Moreover, 9 of the 11 GO groups of genes are involved in mitochondrial respiration function, whose expression reprogramming has been shown to be a major factor for the evolution of aerobic fermentation. Finally, we found that an increase in 5′ UTR length may decrease the +1 nucleosome occupancy. This study provides a new angle to understand the role of 5′ UTR in gene expression regulation and evolution.
UTR length; gene expression evolution; aerobic fermentation
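One way to picture a group-level ECC is sketched below. The abstract does not give the exact formula, so this sketch assumes that ECC is the mean Pearson correlation between each gene in a GO group and the average CRP expression profile across growth conditions; the function names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def group_ecc(expression, group_genes, crp_genes):
    """Assumed ECC: mean correlation of group genes with the mean CRP profile.

    expression maps a gene name to its expression values, one value per
    growth condition, with conditions in the same order for every gene.
    """
    n_cond = len(expression[crp_genes[0]])
    crp_profile = [
        sum(expression[g][i] for g in crp_genes) / len(crp_genes)
        for i in range(n_cond)
    ]
    correlations = [pearson(expression[g], crp_profile) for g in group_genes]
    return sum(correlations) / len(correlations)

# Toy data: four conditions, two CRP genes, two genes in a GO group.
toy = {
    "crp1": [1.0, 2.0, 3.0, 4.0],
    "crp2": [2.0, 4.0, 6.0, 8.0],
    "geneA": [2.0, 3.0, 4.0, 5.0],   # tracks the CRP profile
    "geneB": [4.0, 3.0, 2.0, 1.0],   # opposes the CRP profile
}
ecc_a = group_ecc(toy, ["geneA"], ["crp1", "crp2"])
```

Averaging over the genes of a GO group, as in this sketch, is one simple way to reduce the statistical fluctuations that single-gene correlations would show.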
The expression of microRNA (miRNA) genes undergoes several maturation steps. Recent studies brought new insights into the maturation process, but also raised debates on the maturation mechanism. To understand the mechanism better, we downloaded small RNA sequence reads from NCBI SRA and quantified the expression profiles of miRNAs in normal and tumor liver tissues.
From these miRNA expression profiles, we studied several issues related to miRNA biogenesis. First of all, the 3' ends of mature miRNAs usually carried modified nucleotides, generated from nucleotide addition or RNA editing. We found that adenine accounted for more than 50% of all miRNA 3' end modification events in all libraries. However, uracil dominated over adenine in several miRNA types. Moreover, the miRNA reads in the HBV-associated libraries have much lower rates of nucleotide modification. These results indicate that miRNA 3' end modifications are miRNA specific and may differ between normal and tumor tissues. Secondly, according to the hydrogen-bonding theory, the expression ratio of 5p arm to 3p arm miRNAs, derived from the same pre-miRNA, should be constant over tissues. However, a comparison of the expression profiles of the 5p arm and 3p arm miRNAs showed that one arm is preferred in the normal liver tissue whereas the other is preferred in the tumor liver tissue. In other words, different liver tissues have their own preferences on selecting either arm to be mature miRNAs.
The results suggest that besides the traditional miRNA biogenesis theory, another mechanism may also participate in the miRNA biogenesis pathways.
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein–protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
protein evolution; disordered proteins; microRNA regulation; protein–protein interaction; phosphorylation
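The idea of assessing one indicator while controlling for another can be illustrated with a first-order partial correlation. This is a minimal sketch of the general technique, assuming Pearson correlations and a single control variable; the abstract does not state which correlation measure or how many simultaneous controls were used, and the variable names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z.

    Uses the standard first-order formula
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: x = evolutionary rate, y = disorder content,
# z = number of miRNA types regulating the gene.
rate = [1.0, 2.0, 3.0, 4.0]
disorder = [1.0, 3.0, 2.0, 4.0]
mirna_types = [1.0, -1.0, -1.0, 1.0]
r_partial = partial_correlation(rate, disorder, mirna_types)
```

In this toy example the control variable is uncorrelated with both of the others, so the partial correlation equals the plain correlation; with real data the two generally differ, which is what makes the controlled comparison informative.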
To sense numerous odorants and chemicals, animals have evolved a large number of olfactory receptor genes (Olfrs) in their genome. In particular, the house mouse has ∼1,100 genes in the Olfr gene family. This makes the mouse a good model organism to study Olfr genes and olfaction-related genes. To date, whether male and female mice possess the same ability in detecting environmental odorants is still unknown. Using next-generation sequencing technology (paired-end mRNA-seq), we detected 1,088 expressed Olfr genes in both male and female olfactory epithelium. We found that not only Olfr genes but also odorant-binding protein (Obp) genes have evolved rapidly in the mouse lineage. Interestingly, Olfr genes tend to be expressed at a higher level in males than in females, whereas the Obp genes clustered on the X chromosome show the opposite trend. These observations may imply a more efficient odorant-transporting system in females and a more active Olfr gene expression system in males. In addition, we detected the expression of two genes encoding major urinary proteins, which have been proposed to bind and transport pheromones or act as pheromones in mouse urine. This observation suggests a role for the main olfactory system (MOS) in pheromone detection, contrary to the view that only the accessory olfactory system (AOS) is involved in pheromone detection. This study suggests sexual differences in detecting environmental odorants in the MOS and demonstrates that mRNA-seq provides a powerful tool for detecting genes with low expression levels and with high sequence similarities.
mRNA-seq; olfactory epithelium; olfactory receptor; odorant-binding protein; major urinary protein; sexual differentiation
Aerobic fermentation has evolved independently in two yeast lineages, the Saccharomyces cerevisiae and the Schizosaccharomyces pombe lineages. In the S. cerevisiae lineage, the evolution of aerobic fermentation was shown to be associated with transcriptional reprogramming of the genes involved in respiration and was recently suggested to be linked to changes in nucleosome occupancy pattern in the promoter regions of respiration-related genes. In contrast, little is known about the genetic basis for the evolution of aerobic fermentation in the Sch. pombe lineage. In particular, it is not known whether respiration-related genes in Sch. pombe have undergone transcriptional reprogramming or changes in nucleosome occupancy pattern in their promoter regions. In this study, we compared genome-wide gene expression profiles of Sch. pombe with those of S. cerevisiae and the aerobic respiration yeast Candida albicans. We found that the expression profile of respiration-related genes in Sch. pombe is similar to that of S. cerevisiae, but different from that of C. albicans, suggesting that their transcriptional regulation has been reprogrammed during the evolution of aerobic fermentation. However, we found no significant change in nucleosome organization in the promoters of respiration-related genes in Sch. pombe.
aerobic fermentation; nucleosome organization; gene expression; Schizosaccharomyces pombe
It is now experimentally well established that variant sequences of a cis transcription factor binding site motif can contribute to differential regulation of genes. We characterize the relationship between motif variants and gene expression by analyzing expression microarray data and binding site predictions. To accomplish this, we statistically detect motif variants with effects that differ among environments. Such environmental specificity may be due to either affinity differences between variants or, more likely, differential interactions of TFs bound to these variants with cofactors, together with differential presence of cofactors across environments. We examine conservation of functional variants across four Saccharomyces species, and find that about a third of transcription factors have target genes that are differentially expressed in a condition-specific manner that is correlated with the nucleotide at variant motif positions. We find good correspondence between our results and some cases in the experimental literature (Reb1, Sum1, Mcm1, and Rap1). These results and the growing consensus in the literature indicate that motif variants may often be functionally distinct, that this may be observed in genomic data, and that variants play an important role in condition-specific gene regulation.
The genetic basis of organisms’ adaptation to different environments is a central issue of molecular evolution. The budding yeast Saccharomyces cerevisiae and its relatives predominantly ferment glucose into ethanol even in the presence of oxygen. This was suggested to be an adaptation to glucose-rich habitats, but the underlying genetic basis of the evolution of aerobic fermentation remains unclear. In S. cerevisiae, the first step of glucose metabolism is transporting glucose across the plasma membrane, which is carried out by hexose transporter (HXT) proteins. Although several studies have recognized that the rate of glucose uptake can affect how glucose is metabolized, the role of HXT genes in the evolution of aerobic fermentation has not been fully explored. In this study, we identified all members of the HXT gene family in 23 fully sequenced fungal genomes, reconstructed their evolutionary history to pinpoint gene gain and loss events, and evaluated their adaptive significance in the evolution of aerobic fermentation. We found that the HXT genes have been extensively amplified in the two fungal lineages that have independently evolved aerobic fermentation. In contrast, reduction of the number of HXT genes has occurred in aerobic respiratory species. Our study reveals a strong positive correlation between the copy number of HXT genes and the strength of aerobic fermentation, suggesting that HXT gene expansion has facilitated the evolution of aerobic fermentation.
aerobic fermentation; HXT; glucose metabolism; glucose transport; adaptive evolution
MicroRNAs (miRNAs) cause mRNA degradation or translation suppression of their target genes. Previous studies have found direct involvement of miRNAs in cancer initiation and progression. Artificial miRNAs, designed to target single or multiple genes of interest, provide a new therapeutic strategy for cancer. This study investigates the anti-tumor effect of a novel artificial miRNA, miR P-27-5p, on breast cancer. We reveal that miR P-27-5p downregulates genes associated with the protein modification process and regulation of the cell cycle in T-47D cells. Introduction of miR P-27-5p into breast cell lines inhibits cell proliferation and induces the first “gap” phase (G1) cell cycle arrest in cancer cell lines but does not affect normal breast cells. We further show that miR P-27-5p targets the 3′-untranslated mRNA region (3′-UTR) of cyclin-dependent kinase 4 (CDK4) and reduces both the mRNA and protein level of CDK4, which, in turn, interferes with phosphorylation of the retinoblastoma protein (RB1). Overall, our data suggest that the effects of miR P-27-5p on cell proliferation and G1 cell cycle arrest are through the downregulation of CDK4 and the suppression of RB1 phosphorylation. This study opens avenues for future therapies targeting breast cancer.
miR P-27-5p; exon array; cyclin-dependent kinase 4; cell cycle; breast cancer; retinoblastoma protein
Protein phosphorylation plays an important role in the regulation of protein function. Phosphorylated residues are generally assumed to be subject to functional constraint, but it has recently been suggested from a comparison of distantly related vertebrate species that most phosphorylated residues evolve at rates consistent with those of the surrounding regions. To resolve the controversy, we infer the ancestral phosphoproteome of human and mouse to compare the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues. This approach enables accurate estimation of evolutionary rates as it does not assume deep conservation of phosphorylated residues. We show that phosphorylated S/T residues tend to evolve more slowly than nonphosphorylated S/T residues not only in disordered but also in ordered protein regions, indicating evolutionary conservation of phosphorylated S/T residues in mammals. Thus, phosphorylated S/T residues tend to be subject to stronger functional constraint than nonphosphorylated residues regardless of the protein regions in which they reside. In contrast, phosphorylated Y residues evolve at rates similar to those of nonphosphorylated ones. We also find that the human lineage has gained more phosphorylated T residues and lost fewer phosphorylated Y residues than the mouse lineage. The cause of the gain/loss imbalance remains a mystery but should be worth exploring.
phosphorylated residue; protein disordered region; evolutionary rate; functional constraint
The regulation of gene expression is an important determinant of organismal phenotype and evolution. However, the widespread recognition of this fact occurred long after the synthesis of evolution and genetics. Here, we give a brief sketch of thoughts regarding gene regulation in the history of evolution and genetics. We then review the development of genome-wide studies of gene regulatory variation in the context of the location and mode of action of the causative genetic changes. In particular, we review mapping of the genetic basis of expression variation through expression quantitative trait locus studies and measuring the cis/trans component of expression variation in allele-specific expression studies. We conclude by proposing a systematic integration of ideas that combines global mapping studies, cis/trans tests and modern population genetics methodologies, in order to directly estimate the forces acting on regulatory variation within and between species.
gene regulation; expression evolution; allele-specific expression; expression QTL; cis-regulation; trans-regulation
A practical way to reduce the cost of surveying single-nucleotide polymorphisms (SNPs) in a large number of individuals is to measure the allele frequencies in pooled DNA samples. Pyrosequencing™ has been frequently used for this application because signals generated by this approach are proportional to the amount of DNA template. The Pyrosequencing™ pyrogram is determined by the dispensing order of dNTPs, which is usually designed based on the known SNPs to avoid asynchronous extensions of heterozygous sequences. Therefore, utilizing the pyrogram signals to identify de novo SNPs in DNA pools has never been undertaken. In this study, we developed an algorithm to address this issue. With the sequence and pyrogram of the wild-type allele known in advance, we could use the pyrogram obtained from the pooled DNA sample to predict the sequence of the unknown mutant allele (de novo SNP) and estimate its allele frequency. Both computational simulation and experimental Pyrosequencing™ test results suggested that our method performs well. The web interface of our method is available at http://life.nctu.edu.tw/∼yslin/PSM/.
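The general idea of recovering a mutant allele and its frequency from a pooled pyrogram can be sketched as follows. This is not the authors' algorithm, whose details are not given in the abstract. The sketch assumes an idealized, noise-free pyrogram in which each dispensation yields a signal equal to the number of nucleotides incorporated, a pool containing only the wild-type allele and one single-substitution mutant, and a simple least-squares fit; all function names are hypothetical.

```python
def pyrogram(seq, dispensation):
    """Ideal signal per dispensed nucleotide: length of the incorporated run."""
    signals, pos = [], 0
    for nt in dispensation:
        run = 0
        while pos < len(seq) and seq[pos] == nt:
            run += 1
            pos += 1
        signals.append(float(run))
    return signals

def fit_frequency(observed, wt_signal, mut_signal):
    """Least-squares mutant frequency f for observed = (1-f)*wt + f*mut.

    Returns (f, squared residual), or None if the candidate pyrogram is
    indistinguishable from the wild-type pyrogram.
    """
    diff = [m - w for m, w in zip(mut_signal, wt_signal)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return None
    excess = [o - w for o, w in zip(observed, wt_signal)]
    f = sum(e * d for e, d in zip(excess, diff)) / denom
    f = min(1.0, max(0.0, f))  # a frequency must lie in [0, 1]
    residual = sum((e - f * d) ** 2 for e, d in zip(excess, diff))
    return f, residual

def predict_snp(wild_type, dispensation, observed):
    """Try every single substitution and keep the best-fitting candidate."""
    wt_signal = pyrogram(wild_type, dispensation)
    best = None
    for i, ref in enumerate(wild_type):
        for alt in "ACGT":
            if alt == ref:
                continue
            candidate = wild_type[:i] + alt + wild_type[i + 1:]
            fit = fit_frequency(observed, wt_signal,
                                pyrogram(candidate, dispensation))
            if fit is None:
                continue
            f, residual = fit
            if best is None or residual < best[2]:
                best = (candidate, f, residual)
    return best  # (mutant sequence, estimated frequency, residual)

# Simulated pool: 70% wild type, 30% mutant, cyclic ACGT dispensation.
wt_seq = "ACGTTGCA"
mut_seq = "ACGATGCA"
order = "ACGT" * 8
pooled = [0.7 * w + 0.3 * m
          for w, m in zip(pyrogram(wt_seq, order), pyrogram(mut_seq, order))]
result = predict_snp(wt_seq, order, pooled)
```

A cyclic dispensing order is used here because, for an unknown SNP, the order cannot be tailored to the variant in advance; real pyrograms would also require handling of signal noise and incomplete extension, which this sketch ignores.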
Both cis and trans mutations contribute to gene expression divergence within and between species. We used Saccharomyces cerevisiae as a model organism to estimate the relative contributions of cis and trans variations to the expression divergence between a laboratory (BY) and a wild (RM) strain of yeast. We examined whether genes regulated by a single transcription factor (TF; single input module, SIM genes) or genes regulated by multiple TFs (multiple input module, MIM genes) are more susceptible to trans variation. Because a SIM gene is regulated by a single immediate upstream TF, the chance for a change to occur in its trans-acting factors would, on average, be smaller than that for a MIM gene. We chose 232 genes that exhibited expression divergence between BY and RM to test this hypothesis. We examined the expression patterns of these genes in a BY–RM coculture system and in a BY–RM diploid hybrid. We found that trans variation is far more important than cis variation for expression divergence between the two strains. However, because in 75% of the genes studied, cis variation has significantly contributed to expression divergence, cis change also plays a significant role in intraspecific expression evolution. Interestingly, we found that the proportion of genes with diverged expression between BY and RM is larger for MIM genes than for SIM genes; in fact, the proportion tends to increase with the number of transcription factors that regulate the gene. Moreover, MIM genes are, on average, subject to stronger trans effects than SIM genes, though the difference between the two types of genes is not conspicuous.
cis-regulation; trans-regulation; yeast; expression evolution
How the transcription factor binding sites (TFBSs) are distributed in the promoter region has implications for gene regulation. Previous studies used the translation start codon as the reference point to infer the TFBS distribution. However, it is biologically more relevant to use the transcription start site (TSS) as the reference point. In this study, we reexamined the spatial distribution of TFBSs, investigated various promoter features that may affect the distribution, and studied the effect of TFBS distribution on transcriptional regulation.
We found a sharp peak for the distribution of TFBSs at ~115 bp upstream of the TSS, but no clear peak when the translation start codon was used as the reference point. Our analysis of sequence variation data among 63 yeast strains revealed very low deletion polymorphisms in the region between the distribution peak and the TSS, suggesting that the distances between TFBSs and the TSS have been selectively constrained in evolution. As in previous studies, we found that the nucleosome occupancy and the presence/absence of TATA-box in the promoter region affect the TFBS distribution pattern. In addition, we found that there exists a correlation between the 5'UTR length and the TFBS distribution pattern and we showed that the TFBS distribution pattern affects gene transcription level and plasticity.
The spatial distribution of TFBSs obtained using the TSS as the reference point shows a much sharper peak than does the distribution obtained using the translation start codon as the reference point. The TFBS distribution pattern is affected by nucleosome occupancy and presence of TATA-box and it affects the transcription level and transcription plasticity of the gene.
It is well known that knocking out a gene in an organism often causes no phenotypic effect. One possible explanation is the existence of duplicate genes; that is, the effect of knocking out a gene is compensated by a duplicate copy. Another explanation is the existence of alternative pathways. In terms of metabolic products, the relative roles of the two mechanisms have been extensively studied in yeast but not in any multicellular organism. Here, to address the functional compensation of metabolic products by duplicate genes, we quantified 35 metabolic products from 1,976 genes in knockout mutants of Arabidopsis thaliana by high-throughput liquid chromatography-mass spectrometry (LC-MS) analysis. We found that knocking out either a singleton gene or a duplicate gene with distant paralogs in the genome tends to induce stronger metabolic effects than knocking out a duplicate gene with a close paralog in the genome, indicating that only duplicate genes with close paralogs play a significant role in functional compensation for metabolic products in A. thaliana. To extend the analysis, we examined metabolic products with either high or low connectivity in a metabolic network. We found that the compensatory role of duplicate genes is less important when the metabolite has a high connectivity, indicating that functional compensation by alternative pathways is common in the case of high connectivity. In conclusion, recently duplicated genes play an important role in the compensation of metabolic products only when the number of alternative pathways is small.
gene duplication; metabolic network; Arabidopsis thaliana; functional compensation; alternative pathway
To study the evolution of human microRNAs (miRNAs), we examined nucleotide variation in humans, sequence divergence between species, and genomic clustering patterns for miRNAs with different expression levels. We found that expression level is a major indicator of the rate of evolution and that ∼30% of currently annotated human miRNA genes are almost free of selective pressure.
microRNA expression; sequence divergence; spatial clustering; selective pressure