QBRICK facilitates the integrin α8β1–dependent interactions of cells with basement membranes by regulating the basement membrane assembly of nephronectin.
Dysfunction of the basement membrane protein QBRICK provokes Fraser syndrome, which results in renal dysmorphogenesis, cryptophthalmos, syndactyly, and dystrophic epidermolysis bullosa through unknown mechanisms. Here, we show that integrin α8β1 binding to basement membranes was significantly impaired in Qbrick-null mice. This impaired integrin α8β1 binding was not a direct consequence of the loss of QBRICK, which itself is a ligand of integrin α8β1, because knock-in mice with a mutation in the integrin-binding site of QBRICK developed normally and did not exhibit any defects in integrin α8β1 binding. Instead, the loss of QBRICK significantly diminished the expression of nephronectin, an integrin α8β1 ligand necessary for renal development. In vivo, nephronectin associated with QBRICK and localized at the sublamina densa region, where QBRICK was also located. Collectively, these findings indicate that QBRICK facilitates the integrin α8β1–dependent interactions of cells with basement membranes by regulating the basement membrane assembly of nephronectin and explain why renal defects occur in Fraser syndrome.
Gene sequences are routinely used to determine the topologies of unrooted phylogenetic trees, but many of the most important questions in evolution require knowing both the topologies and the roots of trees. However, general algorithms for calculating rooted trees from gene and genomic sequences in the absence of gene paralogs are few. Using the principles of evolutionary parsimony (EP) (Lake JA. 1987a. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Mol Biol Evol. 4:167–181) and its extensions (Cavender, J. 1989. Mechanized derivation of linear invariants. Mol Biol Evol. 6:301–316; Nguyen T, Speed TP. 1992. A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol. 35:60–76), we explicitly enumerate all linear invariants that solely contain rooting information and derive algorithms for rooting gene trees directly from gene and genomic sequences. These new EP linear rooting invariants allow one to determine rooted trees, even in the complete absence of outgroups and gene paralogs. EP rooting invariants are explicitly derived for three taxon trees, and rules for their extension to four or more taxa are provided. The method is demonstrated using 18S ribosomal DNA to illustrate how the new animal phylogeny (Aguinaldo AMA et al. 1997. Evidence for a clade of nematodes, arthropods, and other moulting animals. Nature 387:489–493; Lake JA. 1990. Origin of the metazoa. Proc Natl Acad Sci USA 87:763–766) may be rooted directly from sequences, even when they are short and paralogs are unavailable. These results are consistent with the current root (Philippe H et al. 2011. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470:255–260).
rooting; trees; gene sequences; evolutionary parsimony; metazoa; Bayesian statistics; linear invariants
Fraser syndrome (FS) features renal agenesis and cystic kidneys. Mutations of FRAS1 (Fraser syndrome 1) and FREM2 (FRAS1-related extracellular matrix protein 2) cause FS. They code for basement membrane proteins expressed in metanephric epithelia where they mediate epithelial/mesenchymal signalling. Little is known about whether and where these molecules are expressed in more mature kidneys.
In healthy and congenital polycystic kidney (cpk) mouse kidneys we sought Frem2 expression using a LacZ reporter gene and quantified Fras family transcripts. Fras1 immunohistochemistry was undertaken in cystic kidneys from cpk mice and PCK (Pkhd1 mutant) rats (models of autosomal recessive polycystic kidney disease) and in wild-type metanephroi rendered cystic by dexamethasone.
Nascent nephrons transiently expressed Frem2 in both tubule and podocyte epithelia. Maturing and adult collecting ducts also expressed Frem2. Frem2 was expressed in cpk cystic epithelia although Frem2 haploinsufficiency did not significantly modify cystogenesis in vivo. Fras1 transcripts were significantly upregulated, and Frem3 downregulated, in polycystic kidneys versus the non-cystic kidneys of littermates. Fras1 was immunodetected in cpk, PCK and dexamethasone-induced cyst epithelia.
These descriptive results are consistent with the hypothesis that Fras family molecules play diverse roles in kidney epithelia. In future, this should be tested by conditional deletion of FS genes in nephron segments and collecting ducts.
Basement membrane; Cyst; Development; Fras1; Frem2; LacZ; Reporter gene; Medicine & Public Health; Pediatrics
Using genomic analysis, researchers previously identified genes coding for proteins homologous to the structural proteins of nitrogenase (J. Raymond, J. L. Siefert, C. R. Staples, and R. E. Blankenship, Mol. Biol. Evol. 21:541-554, 2004). The expression and association of NifD and NifH nitrogenase homologs (named NflD and NflH for “Nif-like” D and H, respectively) have been detected in a non-nitrogen-fixing hyperthermophilic methanogen, Methanocaldococcus jannaschii. These homologs are expressed constitutively and do not appear to be directly involved with nitrogen metabolism or detoxification of compounds such as cyanide or azide. The NflH and NflD proteins were found to interact with each other, as determined by bacterial two-hybrid studies. Upon immunoisolation, NflD and NflH copurified, along with three other proteins whose functions are as yet uncharacterized. The apparent presence of genes coding for NflH and NflD in all known methanogens, their constitutive expression, and their high sequence similarity to the NifH and NifD proteins or the BchL and BchN/BchB proteins suggest that NflH and NflD participate in an indispensable and fundamental function(s) in methanogens.
Several studies in Drosophila have shown excessive movement of retrogenes from the X chromosome to autosomes, and that these genes are frequently expressed in the testis. This phenomenon has led to several hypotheses invoking natural selection as the process driving male-biased genes to the autosomes. Metta and Schlötterer (BMC Evol Biol 2010, 10:114) analyzed a set of retrogenes where the parental gene has been subsequently lost. They assumed that this class of retrogenes replaced the ancestral functions of the parental gene, and reported that these retrogenes, although mostly originating from movement out of the X chromosome, showed female-biased or unbiased expression. These observations led the authors to suggest that selective forces (such as meiotic sex chromosome inactivation and sexual antagonism) were not responsible for the observed pattern of retrogene movement out of the X chromosome.
We reanalyzed the dataset published by Metta and Schlötterer and found several issues that led us to a different conclusion. In particular, Metta and Schlötterer used a dataset combined with expression data in which significant sex-biased expression is not detectable. First, the authors used a segmental dataset where the genes selected for analysis were less testis-biased in expression than those that were excluded from the study. Second, sex-biased expression was defined by comparing male and female whole-body data and not the expression of these genes in gonadal tissues. This approach significantly reduces the probability of detecting sex-biased expressed genes, which explains why the vast majority of the genes analyzed (parental and retrogenes) were equally expressed in both males and females. Third, the female-biased expression observed by Metta and Schlötterer is mostly found for parental genes located on the X chromosome, which is known to be enriched with genes with female-biased expression. Fourth, using additional gonad expression data, we found that autosomal genes analyzed by Metta and Schlötterer are less upregulated in ovaries and have a higher chance of being expressed in meiotic cells of spermatogenesis when compared to X-linked genes.
The criteria used to select retrogenes and the sex-biased expression data based on whole adult flies generated a segmental dataset of female-biased and unbiased expressed genes that was unable to detect the higher propensity of autosomal retrogenes to be expressed in males. Thus, there is no support for the authors’ view that the movement of new retrogenes, which originated from X-linked parental genes, was not driven by selection. Therefore, selection-based genetic models remain the most parsimonious explanations for the observed chromosomal distribution of retrogenes.
Evolutionary adaptation affects demographic resilience to climate change but few studies have attempted to project changes in selective pressures or quantify impacts of trait responses on population dynamics and extinction risk. We used a novel individual-based model to explore potential evolutionary changes in migration timing and the consequences for population persistence in sockeye salmon Oncorhynchus nerka in the Fraser River, Canada, under scenarios of future climate warming. Adult sockeye salmon are highly sensitive to increases in water temperature during their arduous upriver migration, raising concerns about the fate of these ecologically, culturally, and commercially important fish in a warmer future. Our results suggest that evolution of upriver migration timing could allow these salmon to avoid increasingly frequent stressful temperatures, with the odds of population persistence increasing in proportion to the trait heritability and phenotypic variance. With a simulated 2°C increase in average summer river temperatures by 2100, adult migration timing from the ocean to the river advanced by ∼10 days when the heritability was 0.5, while the risk of quasi-extinction was only 17% of that faced by populations with zero evolutionary potential (i.e., heritability fixed at zero). The rates of evolution required to maintain persistence under simulated scenarios of moderate to rapid warming are plausible based on estimated heritabilities and rates of microevolution of timing traits in salmon and related species, although further empirical work is required to assess potential genetic and ecophysiological constraints on phenological adaptation. These results highlight the benefits to salmon management of maintaining evolutionary potential within populations, in addition to conserving key habitats and minimizing additional stressors where possible, as a means to build resilience to ongoing climate change. 
More generally, they demonstrate the importance and feasibility of considering evolutionary processes, in addition to ecology and demography, when projecting population responses to environmental change.
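The link between heritability and the pace of timing evolution described above can be illustrated with the breeder's equation, R = h²S, where R is the per-generation change in the mean trait, h² the heritability, and S the selection differential. The sketch below is not the individual-based model used in the study. The function name, the starting day of 200, the constant selection differential of -1 day per generation, and the 10 generations are all illustrative assumptions.

```python
def project_mean_timing(start_day, heritability, selection_differential, generations):
    """Iterate the breeder's equation R = h^2 * S for a fixed number of generations.

    Every argument is an illustrative assumption, not a value from the study.
    Returns the mean migration day after each generation, starting value included.
    """
    mean_day = start_day
    trajectory = [mean_day]
    for _ in range(generations):
        response = heritability * selection_differential  # R = h^2 * S
        mean_day += response
        trajectory.append(mean_day)
    return trajectory


# Hypothetical scenario: selection favours arriving one day earlier each generation.
no_evolution = project_mean_timing(200.0, 0.0, -1.0, 10)
with_evolution = project_mean_timing(200.0, 0.5, -1.0, 10)
print(no_evolution[-1], with_evolution[-1])
```

With heritability fixed at zero the mean timing never moves, whatever the strength of selection. With a heritability of 0.5 it advances by half the selection differential in each generation.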
The major sperm protein (MSP) of the nematode Caenorhabditis elegans is a low-molecular-weight (15,000) basic protein implicated in the pseudopodial movement of mature spermatozoa. Its synthesis occurs in a specific region of the gonad and is regulated at the level of transcription (M. Klass and D. Hirsh, Dev. Biol. 84:299-312, 1981; S. Ward and M. Klass, Dev. Biol. 92:203-208, 1982; Klass et al., Dev. Biol. 93:152-164, 1982). A developmentally regulated gene family has been identified that codes for this MSP. Whole genomic blots, as well as analysis of genomic clone banks, indicate that there are between 15 and 25 copies of the MSP gene in the nematode genome. Southern blot analysis also indicates that there is no rearrangement or amplification within the MSP gene family during development. No evidence was found of methylation at various restriction sites surrounding the MSP gene family, and similarly, no correlation between methylation and expression was observed. Three distinct members of this MSP gene family have been cloned, and their nucleotide sequences have been determined. Differential screening of a cDNA clone bank made from polyadenylated mRNA from adult males yielded 45 male-specific clones, 32 of which were clones of MSP genes. One of these cDNA clones was found to contain the entire nucleotide sequence for the MSP, including part of the 5' leader and all of the 3' trailing sequence. Genomic clones bearing copies of the MSP genes have been isolated. At least one of the members of this gene family is a pseudogene. Another member of the MSP gene family that has been cloned from genomic DNA contains the entire uninterrupted structural sequence for the MSP in addition to a 5' flanking sequence containing a promoter-like region with the classic TATA box, a sequence resembling the CAAT box, and a putative ribosome-binding sequence. The 3' trailing sequences of the genomic and the cDNA clones contain an AATAAA polyadenylation site.
Does a relationship exist between a protein's evolutionary rate and its number of interactions? This relationship has been put forward many times, based on a biological premise that a highly interacting protein will be more restricted in its sequence changes. However, to date several studies have voiced conflicting views on the presence or absence of such a relationship.
Here we perform a large scale study over multiple data sets in order to demonstrate that the major reason for conflict between previous studies is the use of different but overlapping datasets. We show that lack of correlation between evolutionary rate and number of interactions in a data set is related to the error rate. We also demonstrate that the correlation is not an artifact of the underlying distributions of evolutionary distance and interactions and is therefore likely to be biologically relevant. Further to this, we consider the claim that the dependence is due to gene expression levels and find some supporting evidence. A strong and positive correlation between the number of interactions and the age of a protein is also observed and we show this relationship is independent of expression levels.
A correlation between number of interactions and evolutionary rate is observed but is dependent on the accuracy of the dataset being used. However, it appears that the number of interactions a protein participates in depends more on the age of the protein than the rate at which it changes.
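One common way to ask whether a rate-interaction correlation is explained by a third variable such as expression level is a first-order partial correlation. The sketch below is a minimal illustration of that standard formula, not the analysis performed in the study. The function names and the centred toy values are assumptions, constructed so that rate and degree are related only through expression.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy, centred values (assumptions): both variables equal expression plus
# independent noise, so any rate-degree correlation is due to expression alone.
expression = [1, 1, -1, -1, 1, 1, -1, -1]
noise_rate = [1, -1, 1, -1, 1, -1, 1, -1]
noise_degree = [1, 1, 1, 1, -1, -1, -1, -1]
rate = [e + n for e, n in zip(expression, noise_rate)]
degree = [e + n for e, n in zip(expression, noise_degree)]

raw = pearson(rate, degree)
controlled = partial_correlation(rate, degree, expression)
print(raw, controlled)
```

In this constructed case the raw correlation is clearly non-zero while the partial correlation is essentially zero, which is the pattern expected if expression level were the sole driver. A partial correlation that stays away from zero would instead point to a relationship not accounted for by expression.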
Using forward genetics, we have identified the genes mutated in two classes of zebrafish fin mutants. The mutants of the first class are characterized by defects in embryonic fin morphogenesis, which are due to mutations in a Laminin subunit or an Integrin alpha receptor, respectively. The mutants of the second class display characteristic blistering underneath the basement membrane of the fin epidermis. Three of them are due to mutations in zebrafish orthologues of FRAS1, FREM1, or FREM2, large basement membrane protein encoding genes that are mutated in mouse bleb mutants and in human patients suffering from Fraser Syndrome, a rare congenital condition characterized by syndactyly and cryptophthalmos. Fin blistering in a fourth group of zebrafish mutants is caused by mutations in Hemicentin1 (Hmcn1), another large extracellular matrix protein the function of which in vertebrates was hitherto unknown. Our mutant and dose-dependent interaction data suggest a potential involvement of Hmcn1 in Fraser complex-dependent basement membrane anchorage. Furthermore, we present biochemical and genetic data suggesting a role for the proprotein convertase FurinA in zebrafish fin development and cell surface shedding of Fras1 and Frem2, thereby allowing proper localization of the proteins within the basement membrane of forming fins. Finally, we identify the extracellular matrix protein Fibrillin2 as an indispensable interaction partner of Hmcn1. Thus we have defined a series of zebrafish mutants modelling Fraser Syndrome and have identified several implicated novel genes that might help to further elucidate the mechanisms of basement membrane anchorage and of the disease's aetiology. In addition, the novel genes might prove helpful to unravel the molecular nature of thus far unresolved cases of the human disease.
There are a large number of human genetic syndromes with limb and digit deformities. It has been shown that the genes underlying these syndromes are well conserved in evolution, and most perform the same role even in the fins of fish. One such human syndrome is Fraser Syndrome, characterized by a number of defects including fusion of the fingers (syndactyly). Data obtained with corresponding mouse mutants suggest that all of these defects are due to transient basement membrane disruptions and epithelial blistering during development. Whilst some of the Fraser Syndrome genes have been identified, others are unknown. We show that mutation of the known Fraser Syndrome genes in zebrafish generates comparable blistering defects in the fins. Importantly, we have also identified additional genes and mechanisms required for the same processes. Included in this are hemicentin1, a gene whose function had thus far only been studied in nematodes, and furinA, encoding a proprotein convertase, for which we reveal a novel role in ectodomain shedding of Fras/Frem proteins. This work thus expands our understanding, not only of Fraser Syndrome, but also of the common processes of basement membrane formation and function during fin and limb development.
FRAS1 is mutated in some individuals with Fraser syndrome (FS) and the encoded protein is expressed in embryonic epidermal cells, localizing in their basement membrane (BM). Syndactyly and cryptophthalmos in FS are sequelae of skin fragility but the bases for associated kidney malformations are unclear. We demonstrate that Fras1 is expressed in the branching ureteric bud (UB), and that renal agenesis occurs in homozygous Fras1 null mutant blebbed (bl) mice on a C57BL6J background. In vivo, the bl/bl bud fails to invade metanephric mesenchyme which undergoes involution, events replicated in organ culture. The expression of glial cell line-derived neurotrophic factor and growth-differentiation factor 11 was defective in bl/bl renal primordia in vivo, whereas, in culture, the addition of either growth factor restored bud invasion into the mesenchyme. Mutant primordia also showed deficient expression of Hoxd11 and Six2 transcription factors, whereas the activity of bone morphogenetic protein 4, an anti-branching molecule, was upregulated. In wild types, Fras1 was also expressed by nascent nephrons. Foetal glomerular podocytes expressed Fras1 transcripts and Fras1 immunolocalized in a glomerular BM-like pattern. On a mixed background, bl mutants, and also compound mutants for bl and my, another bleb strain, sometimes survive into adulthood. These mice have two kidneys, which contain subsets of glomeruli with perturbed nephrin, podocin, integrin α3 and fibronectin expression. Thus, Fras1 protein coats branching UB epithelia and is strikingly upregulated in the nephron lineage after mesenchymal/epithelial transition. Fras1 deficiency causes defective interactions between the bud and mesenchyme, correlating with disturbed expression of key nephrogenic molecules. Furthermore, Fras1 may also be required for the formation of normal glomeruli.
Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate.
This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate. We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude.
Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution.
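The comparison behind the homogenization result can be sketched as a contrast between how far apart the rates of two domains are when they are fused and when they sit in separate proteins. The example below is only an illustration of that idea. The mean absolute log-ratio is an assumed distance measure, the function name is hypothetical, and the rate values are invented rather than taken from the study.

```python
from math import log
from statistics import mean


def mean_abs_log_ratio(rate_pairs):
    """Average |ln(rate_a / rate_b)| over domain pairs; 0 means identical rates."""
    return mean(abs(log(a / b)) for a, b in rate_pairs)


# Invented evolutionary rates (assumptions) for the same two domain types,
# measured when fused in one protein and when found in distinct proteins.
fused_pairs = [(0.10, 0.12), (0.30, 0.25)]
separate_pairs = [(0.10, 0.40), (0.30, 0.05)]

fused_gap = mean_abs_log_ratio(fused_pairs)
separate_gap = mean_abs_log_ratio(separate_pairs)
print(fused_gap, separate_gap)
```

A smaller gap for fused pairs than for separate pairs corresponds to homogenization, while a fused gap that is still above zero corresponds to the remaining domain-specific differences.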
This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section.
Gene transfer of the conjugative plasmid pBF1 from Pseudomonas putida to indigenous bacteria in seawater was investigated with a detection system for gene transfer based on the green fluorescent protein (GFP) (C. Dahlberg et al., Mol. Biol. Evol. 15:385–390, 1998). pBF1 was tagged with the gfp gene controlled by a lac promoter which is downregulated in the donor cell by a chromosomal repressor (lacIq). The plasmid donor cells (Pseudomonas putida KT2442) consequently do not express gfp. Transfer to recipient strains lacking the repressor results in expression of gfp. The transconjugant can subsequently be detected by epifluorescence microscopy on a single-cell level. By using this method, transfer of pBF1::gfp and expression of the gfp gene were first shown to occur during nutrient-limiting conditions to several defined recipient bacteria in artificial seawater. Second, we measured transfer of pBF1 from P. putida to the marine bacterial community directly in seawater samples, on a single-cell level, without limiting the detection of gene transfer to the culturable fraction of bacteria. Plasmid transfer was detected on surfaces and in bulk seawater. Seawater bacteria with different morphologies were shown to receive the plasmid. Gene transfer frequencies of 2.3 × 10⁻⁶ to 2.2 × 10⁻⁴ transconjugants per recipient were recorded after 3 days of incubation.
It is of fundamental importance to understand the determinants of the rate of protein evolution. Eukaryotic extracellular proteins are known to evolve faster than intracellular proteins. Although this rate difference appears to be due to the lower essentiality of extracellular proteins than intracellular proteins in yeast, we here show that, in mammals, the impact of extracellularity is independent from the impact of gene essentiality. Our partial correlation analysis indicated that the impact of extracellularity on mammalian protein evolutionary rate is also independent from those of tissue-specificity, expression level, gene compactness, and the number of protein–protein interactions and, surprisingly, is the strongest among all the factors we examined. Similar results were also found from principal component regression analysis. Our findings suggest that different rules govern the pace of protein sequence evolution in mammals and yeasts.
evolutionary rate; subcellular localization; gene essentiality; gene expression level; mammal; yeast
YB-1 is a member of the numerous families of proteins with an evolutionarily ancient cold-shock domain. It is involved in many DNA- and RNA-dependent events and regulates gene expression at different levels. Previously, we found a regulatory element within the 3′ untranslated region (UTR) of YB-1 mRNA that specifically interacted with YB-1 and poly(A)-binding protein (PABP); we also showed that PABP positively affected YB-1 mRNA translation in a poly(A) tail-independent manner (O. V. Skabkina, M. A. Skabkin, N. V. Popova, D. N. Lyabin, L. O. Penalva, and L. P. Ovchinnikov, J. Biol. Chem. 278:18191-18198, 2003). Here, YB-1 is shown to strongly and specifically inhibit its own synthesis at the stage of initiation, with accumulation of its mRNA in the form of free mRNPs. YB-1 and PABP binding sites have been mapped on the YB-1 mRNA regulatory element. These were UCCAG/ACAA for YB-1 and a ∼50-nucleotide A-rich sequence for PABP that overlapped each other. PABP competes with YB-1 for binding to the YB-1 mRNA regulatory element and restores translational activity of YB-1 mRNA that has been inhibited by YB-1. Thus, YB-1 negatively regulates its own synthesis, presumably by specific interaction with the 3′UTR regulatory element, whereas PABP restores translational activity of YB-1 mRNA by displacing YB-1 from this element.
Correction to Wu DD, Irwin DM, Zhang YP: Molecular evolution of the keratin associated protein gene family in mammals, role in the evolution of mammalian hair. BMC Evol Biol 2008, 8:241.
A long-standing assumption in evolutionary biology is that the evolution rate of protein-coding genes depends, largely, on specific constraints that affect the function of the given protein. However, recent research in evolutionary systems biology revealed unexpected, significant correlations between evolution rate and characteristics of genes or proteins that are not directly related to specific protein functions, such as expression level and protein–protein interactions. The strongest connections were consistently detected between protein sequence evolution rate and the expression level of the respective gene. A recent genome-wide proteomic study revealed an extremely strong correlation between the abundances of orthologous proteins in distantly related animals, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster. We used the extensive protein abundance data from this study along with short-term evolutionary rates (ERs) of orthologous genes in nematodes and flies to estimate the relative contributions of structural–functional constraints and the translation rate to the evolution rate of protein-coding genes. Together the intrinsic constraints and translation rate account for approximately 50% of the variance of the ERs. The contribution of constraints is estimated to be 3- to 5-fold greater than the contribution of translation rate.
protein evolution; structural–functional constraints; misfolding; protein abundance
Among the many factors determining protein evolutionary rate, protein-protein interaction degree (PPID) has been intensively investigated in recent years, but its precise effect on protein evolutionary rate is still heavily debated.
We first confirmed that the correlation between protein evolutionary rate and PPID varies considerably across different protein interaction datasets. Specifically, because of the maximal inconsistency between yeast two-hybrid and other datasets, we reasoned that the difference in experimental methods contributes to our inability to clearly define how PPID affects protein evolutionary rate. To address this, we integrated protein interaction and gene co-expression data to derive a co-expressed protein-protein interaction degree (ePPID) measure, which reflects the number of partners with which a protein can permanently interact. Thus, irrespective of the experimental method employed, we found that (1) ePPID is a better predictor of protein evolutionary rate than PPID, (2) ePPID is a more robust predictor of protein evolutionary rate than PPID, and (3) the contribution of ePPID to protein evolutionary rate is statistically independent of expression level. Analysis of hub proteins in the Structural Interaction Network further supported ePPID as a better predictor of protein evolutionary rate than the number of distinct binding interfaces and clarified the slower evolution of co-expressed multi-interface hub proteins over that of other hub proteins.
Our study firmly established ePPID as a robust predictor of protein evolutionary rate, irrespective of experimental method, and underscored the importance of permanent interactions in shaping the evolutionary outcome.
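The idea of restricting interaction degree to co-expressed partners can be sketched as a filtered edge count. The example below is a minimal illustration under stated assumptions, not the authors' procedure. The function name, the toy proteins A, B and C, the co-expression scores, and the 0.5 cutoff are all hypothetical, since the abstract does not specify how co-expression was thresholded.

```python
from collections import defaultdict


def interaction_degrees(interactions, coexpression, threshold=0.5):
    """Return (PPID, ePPID) per protein.

    PPID counts all interaction partners. ePPID counts only partners whose
    co-expression score with the protein reaches the threshold. The threshold
    value is an assumption made for this sketch.
    """
    ppid = defaultdict(int)
    eppid = defaultdict(int)
    for a, b in interactions:
        score = coexpression.get(frozenset((a, b)), 0.0)
        coexpressed = 1 if score >= threshold else 0
        for protein in (a, b):
            ppid[protein] += 1
            eppid[protein] += coexpressed  # adds 0 when not co-expressed
    return dict(ppid), dict(eppid)


# Toy network (assumption): three proteins, only the A-B pair is co-expressed.
interactions = [("A", "B"), ("A", "C"), ("B", "C")]
coexpression = {frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.1}

ppid, eppid = interaction_degrees(interactions, coexpression)
print(ppid, eppid)
```

In this toy network every protein has the same PPID, yet C has no co-expressed partner, so the two measures rank the proteins differently. Pairs without a co-expression score are treated as not co-expressed, which is another choice made here for simplicity.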
Evolutionary rates of proteins in a protein-protein interaction network are primarily governed by the protein connectivity and/or expression level. A recent study revealed the importance of the features of the interacting protein partners, viz., the coefficient of functionality and clustering coefficient in controlling the protein evolutionary rates in a protein-protein interaction (PPI) network.
By multivariate regression analysis we found that three parameters (probability of complex formation, expression level and degree of a protein) independently guide the evolutionary rates of proteins in the PPI network. The complex forming property of a protein and its expression level together accounted for nearly 43% of the total variation, as observed from the first principal component. We also found that for complex forming proteins in the network, those which have partners sharing the same functional class evolve faster than those having partners belonging to different functional classes. The proteins in the dense parts of the network evolve faster than their counterparts which are present in the sparse regions of the network. Taking into account the complex forming ability, we found that all the complex forming proteins considered in this study evolve slower than the non-complex forming proteins irrespective of their localization in the network or the affiliation of their partners to same/different functional classes.
We have shown here that the functionality and clustering coefficient correlated with the degree of the protein in the protein-protein interaction network. We have identified the significant relationship of the complex-forming property of proteins and their evolutionary rates even when they are classified according to the features of their interacting partners. Our study implies that the evolutionarily constrained proteins are actually members of a larger number of protein complexes and this justifies why they have enhanced expression levels.
Genome-wide studies in Saccharomyces cerevisiae concluded that the dominant determinant of protein evolutionary rates is expression level, where highly-expressed proteins evolve most slowly. To determine how this constraint affects the evolution of protein interactions, we directly measure evolutionary rates of protein interface, surface and core residues by structurally mapping domain interactions to yeast genomes. We find that mRNA level and protein abundance, though correlated, report on pressures affecting regions of proteins differently. Pressures proportional to mRNA level slow evolutionary rates of all structural regions and reduce the variability in rate differences between interfaces and other surfaces. In contrast, the evolutionary rate variation within a domain is less dependent on protein abundance. Distinct pressures may be associated primarily with the cost (mRNA level) and functional benefit (protein abundance) of protein production. Interfaces of proteins with low mRNA levels may have higher evolutionary flexibility, and could constitute the raw material for new functions.
Protein interaction networks aim to summarize the complex interplay of proteins in an organism. Early studies suggested that the position of a protein in the network determines its evolutionary rate but there has been considerable disagreement as to what extent other factors, such as protein abundance, modify this reported dependence.
We compare the genomes of Saccharomyces cerevisiae and Caenorhabditis elegans with those of closely related species to elucidate the recent evolutionary history of their respective protein interaction networks. Interaction and expression data are studied in the light of a detailed phylogenetic analysis. The underlying network structure is incorporated explicitly into the statistical analysis. The increased phylogenetic resolution, paired with high-quality interaction data, allows us to resolve the way in which protein interaction network structure and abundance of proteins affect the evolutionary rate. We find that expression levels are better predictors of the evolutionary rate than a protein's connectivity. Detailed analysis of the two organisms also shows that the evolutionary rates of interacting proteins are not sufficiently similar to be mutually predictive.
It appears that meaningful inferences about the evolution of protein interaction networks require comparative analysis of reasonably closely related species. The signature of protein evolution is shaped by a protein's abundance in the organism, its function, and the biological process in which it is involved. Its position in the interaction network and its connectivity may modulate this, but they appear to have only a minor influence on a protein's evolutionary rate.
It has been suggested that rates of protein evolution are influenced, to a great extent, by the proportion of amino acid residues that are directly involved in protein function. In agreement with this hypothesis, recent work has shown a negative correlation between evolutionary rates and the number of protein-protein interactions. However, the extent to which the number of protein-protein interactions influences evolutionary rates remains unclear. Here, we address this question at several different levels of evolutionary relatedness.
Manually curated data on the number of protein-protein interactions among Saccharomyces cerevisiae proteins were examined for possible correlation with evolutionary rates between S. cerevisiae and Schizosaccharomyces pombe orthologs. Only a very weak negative correlation between the number of interactions and the evolutionary rate of a protein was observed. Furthermore, no relationship was found between a more general measure of the evolutionary conservation of S. cerevisiae proteins, based on the taxonomic distribution of their homologs, and the number of protein-protein interactions. However, when the proteins from yeast were sorted into discrete bins according to the number of interactions, the 6.5% of the proteins with the greatest number of interactions evolved, on average, significantly more slowly than the rest of the proteins. Comparisons were also performed using protein-protein interaction data obtained with high-throughput analysis of Helicobacter pylori proteins. No convincing relationship between the number of protein-protein interactions and evolutionary rates was detected, either for comparisons of orthologs from two completely sequenced H. pylori strains or for comparisons of H. pylori and Campylobacter jejuni orthologs, even when the proteins were classified into bins by the number of interactions.
The currently available comparative-genomic data do not support the hypothesis that the evolutionary rates of the majority of proteins substantially depend on the number of protein-protein interactions they are involved in. However, a small fraction of yeast proteins with the largest number of interactions (the hubs of the interaction network) tend to evolve slower than the bulk of the proteins.
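The binning comparison described above can be illustrated with a minimal Python sketch. It is not the authors' analysis: the function name `split_hubs`, the toy (interaction count, evolutionary rate) pairs, and the 20% cutoff are all hypothetical and chosen only to show the mechanics of separating the most-connected proteins from the rest and comparing mean rates.

```python
from statistics import mean


def split_hubs(proteins, top_fraction):
    """Split proteins into the most-connected fraction ("hubs") and the rest.

    `proteins` is a list of (n_interactions, evolutionary_rate) pairs.
    `top_fraction` is the share of proteins treated as hubs (hypothetical cutoff).
    """
    ranked = sorted(proteins, key=lambda p: p[0], reverse=True)
    n_hubs = max(1, round(len(ranked) * top_fraction))
    return ranked[:n_hubs], ranked[n_hubs:]


# Toy data, invented for illustration only: (number of interactions, rate).
toy = [
    (40, 0.05), (35, 0.08), (6, 0.30), (5, 0.22), (4, 0.41),
    (3, 0.35), (2, 0.28), (2, 0.50), (1, 0.33), (1, 0.45),
]

hubs, rest = split_hubs(toy, top_fraction=0.2)
hub_mean = mean(rate for _, rate in hubs)
rest_mean = mean(rate for _, rate in rest)
print(f"hubs: n={len(hubs)}, mean rate={hub_mean:.3f}")
print(f"rest: n={len(rest)}, mean rate={rest_mean:.3f}")
```

A real analysis would also need a significance test on the difference between the two groups; the sketch only computes the group means.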
Myeloperoxidase (MPO) is a member of the mammalian heme peroxidase (MHP) multigene family. Whereas all MHPs oxidize specific halides to generate the corresponding hypohalous acid, MPO is unique in its capacity to oxidize chloride at physiologic pH to produce hypochlorous acid (HOCl), a potent microbicide that contributes to neutrophil-mediated host defense against infection. We have previously resolved the evolutionary relationships in this functionally diverse multigene family and predicted in silico that positive Darwinian selection played a major role in the observed functional diversities (Loughran NB, O'Connor B, O'Fagain C, O'Connell MJ. 2008. The phylogeny of the mammalian heme peroxidases and the evolution of their diverse functions. BMC Evol Biol. 8:101). In this work, we have replaced positively selected residues asparagine 496 (N496), tyrosine 500 (Y500), and leucine 504 (L504) with the amino acids present in the ancestral MHP and have examined the effects on the structure, biosynthesis, and activity of MPO. Analysis in silico predicted that N496F, Y500F, or L504T would perturb hydrogen bonding in the heme pocket of MPO and thus disrupt the structural integrity of the enzyme. Biosynthesis of the mutants stably expressed in human embryonic kidney 293 cells yielded apoproMPO, the heme-free, enzymatically inactive precursor of MPO, that failed to undergo normal maturation or proteolytic processing. As a consequence of the maturational arrest at the apoproMPO stage of development, cells expressing MPO with mutations N496F, Y500F, L504T, individually or in combination, lacked normal peroxidase or chlorinating activity. Taken together, our data provide further support for the in silico predictions of positive selection and highlight the correlation between positive selection and functional divergence. Our data demonstrate that directly probing the functional importance of positive selection can provide important insights into understanding protein evolution.
myeloperoxidase; animal peroxidase family; positive selection; protein evolution; Darwinian selection; functional shift
There is increasing demand to test hypotheses that contrast the evolution of genes and gene families among genomes, using simulations that work across these levels of organization. The EvolSimulator program was developed recently to provide a highly flexible platform for forward simulations of amino acid evolution in multiple related lineages of haploid genomes, permitting copy number variation and lateral gene transfer. Synonymous nucleotide evolution is not currently supported, however, and would be highly advantageous for comparisons to full genome, transcriptome, and single nucleotide polymorphism (SNP) datasets. In addition, EvolSimulator creates new genomes for each simulation, and does not allow the input of user-specified sequences and gene family information, limiting the incorporation of further biological realism and/or user manipulations of the data.
We present modified C++ source code for the EvolSimulator platform, which we provide as the extension module NU-IN. With NU-IN, synonymous and non-synonymous nucleotide evolution is fully implemented, and the user has the ability to use real or previously-simulated sequence data to initiate a simulation of one or more lineages. Gene family membership can be optionally specified, as well as gene retention probabilities that model biased gene retention. We provide PERL scripts to assist the user in deriving this information from previous simulations. We demonstrate the features of NU-IN by simulating genome duplication (polyploidy) in the presence of ongoing copy number variation in an evolving lineage. This example is initiated with real genomic data, and produces output that we analyse directly with existing bioinformatic pipelines.
The NU-IN extension module is a publicly available open source software (GNU GPLv3 license) extension to EvolSimulator. With the NU-IN module, users are now able to simulate both drift and selection at the nucleotide, amino acid, copy number, and gene family levels across sets of related genomes, for user-specified starting sequences and associated parameters. These features can be used to generate simulated genomic datasets under an extremely broad array of conditions, and with a high degree of biological realism.
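The distinction between synonymous and non-synonymous nucleotide changes that NU-IN simulates can be sketched in a few lines of Python. This is not NU-IN or EvolSimulator code (both are C++); the function name `classify_substitution` is hypothetical, and the sketch assumes the standard genetic code with codons compared one pair at a time.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon_before, codon_after):
    """Label a codon change by whether the encoded amino acid changes.

    Stop codons are represented as '*', so a change to or from a stop
    codon is reported as non-synonymous.
    """
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "non-synonymous"


# Illustrative codon pairs (hypothetical examples, not simulation output).
for before, after in [("CTT", "CTC"), ("ATG", "ATA"), ("TGG", "TGA")]:
    print(before, "->", after, classify_substitution(before, after))
```

Synonymous changes leave the protein sequence intact, which is why a simulator that tracks only amino acids cannot be compared directly with nucleotide-level datasets such as SNP calls.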
The C2H2 zinc-finger (ZNF) containing gene family is one of the largest and most complex gene families in metazoan genomes. These genes are known to exist in almost all eukaryotes, and they constitute a major subset of eukaryotic transcription factors. The genes of this family usually occur as clusters in genomes and are thought to have undergone a massive expansion in vertebrates by multiple tandem duplication events (BMC Evol Biol 8:176, 2008).
In this study, we combined two popular approaches for homolog detection, Reciprocal Best Hit (RBH) (Proc Natl Acad Sci USA 95:6239–6244, 1998) and hidden Markov model (HMM) profile search (Bioinformatics 14:755–763, 1998), on a diverse set of complete genomes of 124 eukaryotic species, ranging from excavates to humans, to identify all detectable members of 37 C2H2 ZNF gene families. We succeeded in identifying 3,890 genes as distinct members of 37 C2H2 gene families. These 37 families are distributed among the eukaryotes as progressive additions of gene blocks with increasing complexity of the organisms. The first block, featuring the protists, had 7 families; the second block, featuring plants, had 2 families; the third block, featuring the fungi, had 2 families (one of which was also present in plants); and the final block consisted of metazoans with 25 families. Among the metazoans, the simpler unicellular metazoans had just 15 of the 25 families, while most of the bilaterians had all 25 families, making up a total of 37 families. Multiple potential examples of lineage-specific gene duplications and gene losses were also observed.
Our hybrid approach combines features of both the RBH and HMM methods for homolog detection. This largely automated technique is much faster than manual methods and is able to detect homologs accurately and efficiently among a diverse set of organisms. Our analysis of the 37 evolutionarily conserved C2H2 ZNF gene families revealed a stepwise appearance of ZNF families, agreeing well with the phylogenetic relationship of the organisms compared and their presumed stepwise increase in complexity (Science 300:1694, 2003).
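The Reciprocal Best Hit criterion used in the hybrid approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, gene identifiers, and similarity scores are hypothetical, and the HMM profile search step is not shown. Scores are assumed to come from some pairwise search run in both directions between two genomes.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy search results between genome A and genome B (invented scores).
a_vs_b = {("a1", "b1"): 200.0, ("a1", "b2"): 50.0,
          ("a2", "b2"): 120.0, ("a3", "b1"): 90.0}
b_vs_a = {("b1", "a1"): 195.0, ("b1", "a3"): 80.0,
          ("b2", "a2"): 110.0, ("b2", "a1"): 40.0}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, `a3` finds `b1` as its best hit, but `b1` prefers `a1`, so `a3` is excluded. That asymmetry is what makes the reciprocal requirement stricter than a one-directional best-hit search.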
C2H2 Zinc Finger Genes; Family Expansion; Orthologs Detection; HMM; RBH
Bone marrow-derived cells (BMCs) and inflammatory chemokine receptors regulate arteriogenesis and angiogenesis. Here, we tested whether arteriolar remodeling in response to an inflammatory stimulus is dependent on BMC-specific chemokine (C-C motif) receptor 2 (CCR2) expression and whether this response involves BMC transdifferentiation into smooth muscle.
Methods and Results
Dorsal skinfold window chambers were implanted into C57Bl/6 wild-type (WT) mice, as well as the following bone marrow chimeras (donor-host): WT-WT, CCR2−/−-WT, WT-CCR2−/−, and EGFP+-WT. One day after implantation, tissue MCP-1 levels rose from “undetectable” to 463 pg/mg, and the number of EGFP+ cells increased more than 4-fold, indicating marked inflammation. A 66% (28 μm) increase in maximum arteriolar diameter was observed over 7 days in WT-WT mice. This arteriolar remodeling response was completely abolished in CCR2−/−-WT mice but largely rescued in WT-CCR2−/− mice. EGFP+ BMCs were numerous throughout the tissue, but we found no evidence that EGFP+ BMCs transdifferentiate into smooth muscle, based on examination of >800 arterioles and venules.
BMC-specific CCR2 expression is required for injury/inflammation-associated arteriolar remodeling, but this response is not characterized by the differentiation of BMCs into smooth muscle.