Epigenetic modifications of DNA, such as cytosine methylation, are differentially abundant in diseases such as cancer. A goal for clinical research is finding sites that are differentially methylated between groups of samples to act as potential biomarkers for disease outcome. However, clinical samples are often limited in availability, represent a heterogeneous collection of cells or are of uncertain clinical class. Array-based methods for identification of methylation provide a cost-effective way to survey a proportion of the methylome at single-base resolution. The Illumina Infinium array has become a popular and reliable high-throughput method in this field and is proving useful in the identification of biomarkers for disease. Here, we compare a commonly used statistical test with a new intuitive and flexible computational approach to quickly detect differentially methylated sites. The method rapidly identifies and ranks candidate lists with greatest inter-group variability whilst controlling for intra-group variability. Intuitive and biologically relevant filters can be imposed to quickly identify sites and genes of interest.
DNA methylation array; Infinium 450k; biomarker discovery; differential methylation; epigenetics; DNA methylome; epigenomics; NIMBL
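The idea of ranking sites by inter-group variability while controlling for intra-group variability can be illustrated with a minimal sketch. This is not the published method: the function name `rank_sites`, the thresholds, the use of group medians and within-group ranges, and the toy beta values are all assumptions made for illustration.

```python
from statistics import median

def rank_sites(betas, group_a, group_b, max_within_range=0.2, min_median_diff=0.3):
    """Rank CpG sites by between-group difference, filtering noisy sites.

    betas: dict mapping site id -> dict mapping sample id -> beta value (0..1).
    The two thresholds are hypothetical defaults, not values from the abstract.
    """
    ranked = []
    for site, values in betas.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        # Intra-group variability: widest spread of beta values in either group.
        within = max(max(a) - min(a), max(b) - min(b))
        # Inter-group variability: difference between the group medians.
        diff = median(b) - median(a)
        if within <= max_within_range and abs(diff) >= min_median_diff:
            ranked.append((site, diff, within))
    # Largest absolute between-group difference first.
    ranked.sort(key=lambda r: abs(r[1]), reverse=True)
    return ranked

# Hypothetical toy data: three "normal" and three "tumour" samples.
betas = {
    "cg_site_1": {"n1": 0.10, "n2": 0.15, "n3": 0.12, "t1": 0.80, "t2": 0.85, "t3": 0.78},
    "cg_site_2": {"n1": 0.50, "n2": 0.52, "n3": 0.49, "t1": 0.55, "t2": 0.51, "t3": 0.50},
    "cg_site_3": {"n1": 0.20, "n2": 0.70, "n3": 0.30, "t1": 0.90, "t2": 0.40, "t3": 0.85},
    "cg_site_4": {"n1": 0.90, "n2": 0.88, "n3": 0.92, "t1": 0.45, "t2": 0.50, "t3": 0.48},
}
normal = ["n1", "n2", "n3"]
tumour = ["t1", "t2", "t3"]

hits = rank_sites(betas, normal, tumour)
for site, diff, within in hits:
    print(f"{site}: median difference {diff:+.2f}, max within-group range {within:.2f}")
```

In this toy example, the site with a small between-group difference and the site with large within-group spread are both filtered out, leaving only sites that separate the groups cleanly.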
Understanding the origins and evolution of synapses may provide insight into species diversity and organisation of the brain. Using comparative proteomics and genomics we examined the evolution of the postsynaptic density (PSD) and MAGUK associated signalling complexes (MASCs) underlying learning and memory. PSD/MASC orthologues found in yeast perform basic cellular functions regulating protein synthesis and structural plasticity. Striking changes in signalling complexity were observed at the yeast:metazoan and invertebrate:vertebrate boundaries, with expansion of key synapse components, notably receptors, adhesion/cytoskeletal and scaffold proteins. Proteomic comparison of Drosophila and mouse MASCs revealed species-specific adaptation with greater signalling complexity in mouse. Although synapse components were conserved amongst diverse vertebrate species, mapping mRNA and protein expression within the mouse brain showed vertebrate-specific components preferentially contributed to differences between brain regions. We propose that evolution of synapse complexity around a core proto-synapse has contributed to invertebrate–vertebrate differences and to brain specialisation.
A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin’s (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2–3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris.
13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin’s finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia related functions, and may be examples of adaptively evolving reproductive proteins.
These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin’s finches.
Genomics; Evolution; Darwin’s finches; Large ground finch; Geospiza magnirostris
The use of genome-wide methylation arrays has proved very informative for investigating both clinical and biological questions in human epigenomics. The use of clustering methods, either for exploration of these data or for comparison with an a priori grouping (e.g., normal versus disease), allows assessment of groupings of data without user bias. However, no consensus has been reached on which methods to use for clustering methylation array data. To determine the most appropriate clustering method for analysis of Illumina array methylation data, a collection of data sets was simulated and used to compare clustering methods. Both hierarchical clustering and non-hierarchical clustering methods (k-means, k-medoids, and fuzzy clustering algorithms) were compared using a range of distance and linkage methods. As no single method consistently outperformed others across different simulations, we propose a method to capture the best clustering outcome based on an additional measure, the silhouette width. This approach produced a consistently higher cluster accuracy compared to using any one method in isolation.
hierarchical; k-means; k-medoids; epigenomics; epigenetics; illumina; infinium
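The use of silhouette width to choose among competing clustering outcomes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the Euclidean distance, the toy beta values and the two candidate label sets (standing in for the output of different clustering methods) are all assumptions.

```python
from math import sqrt

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_silhouette_width(points, labels, dist=euclidean):
    """Average silhouette width s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)

def best_clustering(points, candidates):
    """Return the name of the candidate labelling with the highest mean width."""
    scores = {name: mean_silhouette_width(points, labs) for name, labs in candidates.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: four samples, beta values at three sites.
samples = [(0.10, 0.20, 0.10), (0.15, 0.10, 0.20), (0.80, 0.90, 0.85), (0.90, 0.80, 0.90)]
# Hypothetical labellings, as if produced by two different clustering methods.
candidates = {"method_a": [0, 0, 1, 1], "method_b": [0, 1, 0, 1]}

best, scores = best_clustering(samples, candidates)
print(best, scores)
```

In practice the candidate labellings would come from the hierarchical, k-means, k-medoids and fuzzy clustering runs, and the one with the highest mean silhouette width would be retained.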
The thermoregulatory function of brown adipose tissue (BAT) is due to the tissue-specific expression of uncoupling protein 1 (UCP1) which is thought to have evolved in early mammals. We report that a CpG island close to the UCP1 transcription start site is highly conserved in all 29 vertebrates examined apart from the mouse and Xenopus. Using methylation sensitive restriction digest and bisulfite mapping we show that the CpG island in both the bovine and human is largely un-methylated and is not related to differences in UCP1 expression between white and BAT. Tissue-specific expression of UCP1 has been proposed to be regulated by a conserved 5′ distal enhancer which has been reported to be absent in marsupials. We demonstrate that the enhancer is also absent in five eutherians as well as marsupials, monotremes, amphibians, and fish, is present in pigs despite UCP1 having become a pseudogene, and that absence of the enhancer element does not relate to BAT-specific UCP1 expression. We identify an additional putative 5′ regulatory unit which is conserved in 14 eutherian species but absent in other eutherians and vertebrates, but again unrelated to UCP1 expression. We conclude that despite clear evidence of conservation of regulatory elements in the UCP1 5′ untranslated region, this does not appear to be related to species- or tissue-specific expression of UCP1.
CpG islands; methylation; uncoupling protein 1; phylogenic analysis
Previous studies have proposed that mammalian Toll-like receptors (TLRs) have evolved under diversifying selection due to their role in pathogen detection. To determine if this is the case, we examined the extent of adaptive evolution in the TLR5 gene in both individual species and defined clades of the Mammalia.
In support of previous studies, we find evidence of adaptive evolution of mammalian TLR5. However, we also show that TLR5 genes of domestic livestock have a concentration of single nucleotide polymorphisms suggesting a specific signature of adaptation. Using codon models of evolution we have identified a concentration of rapidly evolving codons within the TLR5 extracellular domain, a site of interaction between host and the bacterial surface protein flagellin.
The results suggest that interactions between pathogen and host may be driving adaptive change in TLR5 by competition between species. In support of this, we have identified single nucleotide polymorphisms (SNP) in sheep and cattle TLR5 genes that are co-localised and co-incident with the predicted adaptive codons suggesting that adaptation in this region of the TLR5 gene is on-going in domestic species.
Toll-like receptor; SNP; Adaptive evolution; Positive selection; Sheep; Cattle
The expression and function of embryonic myosin heavy chain (eMYH) has not been investigated within the early developing heart. This is despite the knowledge that other structural proteins, such as alpha and beta myosin heavy chains and cardiac alpha actin, play crucial roles in atrial septal development and cardiac function. Most cases of atrial septal defects and cardiomyopathy are not associated with a known causative gene, suggesting that further analysis into candidate genes is required. Expression studies localised eMYH in the developing chick heart. eMYH knockdown was achieved using morpholinos in a temporal manner and functional studies were carried out using electrical and calcium signalling methodologies. Knockdown in the early embryo led to abnormal atrial septal development and heart enlargement. Intriguingly, action potentials of the eMYH knockdown hearts were abnormal in comparison with the alpha and beta myosin heavy chain knockdowns and controls. Although myofibrillogenesis appeared normal, in knockdown hearts the tissue integrity was affected owing to apparent focal points of myocyte loss and an increase in cell death. An expression profile of human skeletal myosin heavy chain genes suggests that human myosin heavy chain 3 is the functional homologue of the chick eMYH gene. These data provide compelling evidence that eMYH plays a crucial role in important processes in the early developing heart and, hence, is a candidate causative gene for atrial septal defects and cardiomyopathy.
Atrial septal development; Cardiomyopathy; Myosin; Chick
Supplementation with folic acid during pregnancy is known to reduce the risk of neural tube defects and low birth weight. It is thought that folate and other one-carbon intermediates might secure these clinical effects via DNA methylation. We examined the effects of folate on the human methylome using quantitative interrogation of 27,578 CpG loci associated with 14,496 genes at single-nucleotide resolution across 12 fetal cord blood samples. Consistent with previous studies, the majority of CpG dinucleotides located within CpG islands exhibited hypomethylation while those outside CpG islands showed mid-high methylation. However, for the first time in human samples, unbiased analysis of methylation across samples revealed a significant correlation of methylation patterns with plasma homocysteine, LINE-1 methylation and birth weight centile. Additionally, CpG methylation significantly correlated with either birth weight or LINE-1 methylation were predominantly located in CpG islands. These data indicate that levels of folate-associated intermediates in cord blood reflect their influence and consequences for the fetal epigenome and potentially on pregnancy outcome. In these cases, their influence might be exerted during late gestation or reflect those present during the peri-conceptual period.
cord blood; birth weight; folic acid; homocysteine; BeadArray; hierarchical clustering; Illumina
Hsp-90 from the free-living nematode Caenorhabditis elegans is unique in that it fails to bind to the specific Hsp-90 inhibitor, geldanamycin (GA). Here we surveyed 24 different free-living or parasitic nematodes with the aim of determining whether C. elegans Hsp-90 was the exception or the norm amongst the nematodes. We combined these data with codon evolution models in an attempt to identify whether hsp-90 from GA-binding and non-binding species has evolved under different evolutionary constraints.
We show that GA-binding is associated with life history: free-living nematodes and those parasitic species with free-living larval stages failed to bind GA. In contrast, obligate parasites and those worms in which the free-living stage in the environment is enclosed within a resistant egg, possess a GA-binding Hsp-90. We analysed Hsp-90 sequences from fifteen nematode species to determine whether nematode hsp-90s have undergone adaptive evolution that influences GA-binding. Our data provide evidence of rapid diversifying selection in the evolution of the hsp-90 gene along three separate lineages, and identified a number of residues showing significant evidence of adaptive evolution. However, we were unable to prove that the selection observed is correlated with the ability to bind geldanamycin or not.
Hsp-90 is a multi-functional protein and the rapid evolution of the hsp-90 gene presumably correlates with other key cellular functions. Factors other than primary amino acid sequence may influence the ability of Hsp-90 to bind to geldanamycin.
Related species, such as humans and chimpanzees, often experience the same disease with varying degrees of pathology, as seen in the cases of Alzheimer's disease, or differing symptomatology as in AIDS. Furthermore, certain diseases such as schizophrenia, epithelial cancers and autoimmune disorders are far more frequent in humans than in other species for reasons not associated with lifestyle. Genes that have undergone positive selection during species evolution are indicative of functional adaptations that drive species differences. Thus we investigate whether biomedical disease differences between species can be attributed to positively selected genes.
We identified genes that putatively underwent positive selection during the evolution of humans and four mammals which are often used to model human diseases (mouse, rat, chimpanzee and dog). We show that genes predicted to have been subject to positive selection pressure during human evolution are implicated in diseases such as epithelial cancers, schizophrenia, autoimmune diseases and Alzheimer's disease, all of which differ in prevalence and symptomatology between humans and their mammalian relatives.
In agreement with previous studies, the chimpanzee lineage was found to have more genes under positive selection than any of the other lineages. In addition, we found new evidence to support the hypothesis that genes that have undergone positive selection tend to interact with each other. This is the first such evidence to be detected widely among mammalian genes and may be important in identifying molecular pathways causative of species differences.
Our dataset of genes predicted to have been subject to positive selection in five species serves as an informative resource that can be consulted prior to selecting appropriate animal models during drug target validation. We conclude that studying the evolution of functional and biomedical disease differences between species is an important way to gain insight into their molecular causes and may provide a method to predict when animal models do not mirror human biology.
Whole genome studies have highlighted duplicated genes as important substrates for adaptive evolution. We have investigated adaptive evolution in this class of genes in the human parasite Trypanosoma brucei, as indicated by the ratio of non-synonymous (amino-acid changing) to synonymous (amino acid retaining) nucleotide substitution rates.
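The ratio of non-synonymous to synonymous substitution rates can be illustrated with a simple counting sketch. It assumes the numbers of differences and sites have already been counted for a pair of aligned sequences and applies the Jukes-Cantor correction to each proportion; it is not the codon-level model analysis reported here, and the function names and counts are hypothetical.

```python
from math import log

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)

def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from pre-computed difference and site counts."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn, ds, dn / ds

# Hypothetical counts for a pair of paralogues (not data from the study).
dn, ds, omega = dn_ds(nonsyn_diffs=30, nonsyn_sites=700, syn_diffs=5, syn_sites=300)
print(f"dN = {dn:.4f}, dS = {ds:.4f}, dN/dS = {omega:.2f}")

# Same counts swapped between the two classes, as a contrasting example.
_, _, omega_conserved = dn_ds(nonsyn_diffs=5, nonsyn_sites=700, syn_diffs=30, syn_sites=300)
```

A ratio above 1 is conventionally read as a signal of positive selection and a ratio well below 1 as purifying selection, although gene-wide averages can mask selection acting on only a few codons.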
We have identified duplicated genes that are most rapidly evolving in this important human parasite. This is the first attempt to investigate adaptive evolution in this species at the codon level. We identify 109 genes within 23 clusters of paralogous gene expansions to be subject to positive selection.
Genes identified include surface antigens in both the mammalian and insect host life cycle stage suggesting that competitive interaction is not solely with the adaptive immune system of the mammalian host. Also surface transporters related to drug resistance and genes related to developmental progression are detected. We discuss how adaptive evolution of these genes may highlight lineage specific processes essential for parasite survival. We also discuss the implications of adaptive evolution of these targets for parasite biology and control.
Glutamate-gated postsynaptic receptors in the central nervous system (CNS) are essential for environmentally stimulated behaviours including learning and memory in both invertebrates and vertebrates. Though their genetics, biochemistry, physiology, and role in behaviour have been intensely studied in vitro and in vivo, their molecular evolution and structural aspects remain poorly understood. To understand how these receptors have evolved different physiological requirements we have investigated the molecular evolution of glutamate-gated receptors and ion channels, in particular the N-methyl-D-aspartate (NMDA) receptor, which is essential for higher cognitive function. Studies of rodent NMDA receptors show that the C-terminal intracellular domain forms a signalling complex with enzymes and scaffold proteins, which is important for neuronal and behavioural plasticity.
The vertebrate NMDA receptor was found to have subunits with C-terminal domains up to 500 amino acids longer than invertebrates. This extension was specific to the NR2 subunit and occurred before the duplication and subsequent divergence of NR2 in the vertebrate lineage. The shorter invertebrate C-terminus lacked vertebrate protein interaction motifs involved with forming a signaling complex although the terminal PDZ interaction domain was conserved. The vertebrate NR2 C-terminal domain was predicted to be intrinsically disordered but with a conserved secondary structure.
We highlight an evolutionary adaptation specific to vertebrate NMDA receptor NR2 subunits. Using in silico methods we find that evolution has shaped the NMDA receptor C-terminus into an unstructured but modular intracellular domain that parallels the expansion in complexity of an NMDA receptor signalling complex in the vertebrate lineage. We propose the NR2 C-terminus has evolved to be a natively unstructured yet flexible hub organising postsynaptic signalling. The evolution of the NR2 C-terminus and its associated signalling complex may contribute to species differences in behaviour and in particular cognitive function.
The genes for salivary androgen-binding protein (ABP) subunits have been evolving rapidly in ancestors of the house mouse Mus musculus, as evidenced both by recent and extensive gene duplication and by high ratios of nonsynonymous to synonymous nucleotide substitution rates. This makes ABP an appropriate model system with which to investigate how recent adaptive evolution of paralogous genes results in functional innovation (neofunctionalization).
It was our goal to find evidence for the expression of as many of the Abp paralogues in the mouse genome as possible. We observed expression of six Abpa paralogues and five Abpbg paralogues in ten glands and other organs located predominantly in the head and neck (olfactory lobe of the brain, three salivary glands, lacrimal gland, Harderian gland, vomeronasal organ, and major olfactory epithelium). These Abp paralogues differed dramatically in their specific expression in these different glands and in their sexual dimorphism of expression. We also studied the appearance of expression in both late-stage embryos and postnatal animals prior to puberty and found significantly different timing of the onset of expression among the various paralogues.
The multiple changes in the spatial expression profile of these genes resulting in various combinations of expression in glands and other organs in the head and face of the mouse strongly suggest that neofunctionalization of these genes, driven by adaptive evolution, has occurred following duplication. The extensive diversification in expression of this family of proteins provides two lines of evidence for a pheromonal role for ABP: 1) different patterns of Abpa/Abpbg expression in different glands; and 2) sexual dimorphism in the expression of the paralogues in a subset of those glands. These expression patterns differ dramatically among various glands that are located almost exclusively in the head and neck, where the sensory organs are located. Since mice are nocturnal, it is expected that they will make extensive use of olfactory as opposed to visual cues. The glands expressing Abp paralogues produce secretions (lacrimal and salivary) or detect odors (MOE and VNO) and thus it appears highly likely that ABP proteins play a role in olfactory communication.
Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis.
Aims and methods
To determine the contribution of DYNC2H1 to JATD, we screened the gene in 71 JATD patients, combining SNP mapping, Sanger sequencing and exome sequencing.
Results and conclusions
We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype–phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes.
Clinical Genetics; Molecular Genetics; Developmental; Diagnostics; Genetic Screening/Counselling