The Andes-Amazon basin of Peru and Bolivia is one of the most data-poor, biologically rich, and rapidly changing areas of the world. Conservation scientists agree that this area hosts extremely high endemism, perhaps the highest in the world, yet we know little about the geographic distributions of these species and ecosystems within country boundaries. To address this need, we have developed conservation data on endemic biodiversity (~800 species of birds, mammals, amphibians, and plants) and terrestrial ecological systems (~90; groups of vegetation communities resulting from the action of ecological processes, substrates, and/or environmental gradients) with which we conduct a fine-scale conservation prioritization across the Amazon watershed of Peru and Bolivia. We modelled the geographic distributions of 435 endemic plants and all 347 endemic vertebrate species, from existing museum and herbaria specimens at a regional conservation practitioner's scale (1:250,000-1:1,000,000), based on the best available tools and geographic data. We mapped ecological systems, endemic species concentrations, and irreplaceable areas with respect to national-level protected areas.
We found that sizes of endemic species distributions ranged widely (< 20 km² to > 200,000 km²) across the study area. Bird and mammal endemic species richness was greatest within a narrow 2500-3000 m elevation band along the length of the Andes Mountains. Endemic amphibian richness was highest at 1000-1500 m elevation and concentrated in the southern half of the study area. Geographical distribution of plant endemism was highly taxon-dependent. Irreplaceable areas, defined as locations with the highest number of species with narrow ranges, overlapped slightly with areas of high endemism, yet generally exhibited unique patterns across the study area by species group. We found that many endemic species and ecological systems are lacking national-level protection; a third of endemic species have distributions completely outside of national protected areas. Protected areas cover only 20% of areas of high endemism and 20% of irreplaceable areas. Almost 40% of the 91 ecological systems are in serious need of protection (≤ 2% of their ranges protected).
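The two mapped quantities described above, endemic richness and irreplaceability (the number of narrow-range species at a location), can be illustrated with a minimal sketch. It assumes each species range has already been rasterized to a set of grid cells; the function names, the toy species, and the `max_cells` threshold are hypothetical and are not taken from the study's actual workflow.

```python
from collections import Counter

def richness(ranges):
    """Count, for every grid cell, how many species ranges include it."""
    counts = Counter()
    for cells in ranges.values():
        counts.update(cells)  # each cell in a set is counted once per species
    return counts

def narrow_range_counts(ranges, max_cells):
    """Count, per grid cell, only species whose range spans at most max_cells cells."""
    counts = Counter()
    for cells in ranges.values():
        if len(cells) <= max_cells:
            counts.update(cells)
    return counts

# Toy ranges (hypothetical): species name -> set of (row, col) grid cells.
ranges = {
    "sp_a": {(0, 0), (0, 1)},
    "sp_b": {(0, 1)},
    "sp_c": {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)},
}

endemic_richness = richness(ranges)
irreplaceability = narrow_range_counts(ranges, max_cells=2)
```

In this toy grid, a cell occupied only by the wide-ranging `sp_c` contributes to richness but not to irreplaceability, which is one way the two maps can diverge.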
We identify, for the first time, areas of high endemic species concentrations and high irreplaceability that have only been roughly indicated in the past at the continental scale. We conclude that new complementary protected areas are needed to safeguard these endemics and ecosystems. An expansion in protected areas will be challenged by geographically isolated micro-endemics, varied endemic patterns among taxa, increasing deforestation, resource extraction, and changes in climate. Relying on pre-existing collections and publicly accessible datasets and tools, this working framework is exportable to other regions plagued by incomplete conservation data.
Andes-Amazon; conservation planning; ecological systems; endemic species richness; irreplaceability; Latin America
The epigenome plays a pivotal role as the interface between genome and environment. True genome-wide assessments of epigenetic marks, such as DNA methylation (methylomes) or chromatin modifications (chromatinomes), are now possible, either through high-throughput arrays or increasingly by second-generation DNA sequencing methods. Data at this level of resolution allow us to pose detailed questions about changes that occur due to development, lineage and tissue-specificity, and, significantly, those caused by environmental influences such as ageing, stress, diet, hormones or toxins. Common complex traits are under variable levels of genetic influence and, additionally, epigenetic effects. The detection of pathological epigenetic alterations will reveal additional insights into the aetiology of these traits and into how environmental modulation of this mechanism may occur. Due to the reversibility of these marks, the potential for sequence-specific targeted therapeutics exists. This review surveys recent epigenomic advances and their current and prospective application to the study of common diseases.
Genomics; epigenetics; epigenomics; common disease; complex traits; gene environment interaction
Cellular differentiation involves widespread epigenetic reprogramming, including modulation of DNA methylation patterns. Using Differential Methylation Hybridization (DMH) in combination with a custom DMH array containing 51,243 features covering more than 16,000 murine genes, we carried out a genome-wide screen for cell- and tissue-specific differentially methylated regions (tDMRs) in undifferentiated embryonic stem cells (ESCs), in in-vitro induced neural stem cells (NSCs) and 8 differentiated embryonic and adult tissues. Unsupervised clustering of the generated data showed distinct cell- and tissue-specific DNA methylation profiles, revealing 202 significant tDMRs (p<0.005) between ESCs and NSCs and a further 380 tDMRs (p<0.05) between NSCs/ESCs and embryonic brain tissue. We validated these tDMRs using direct bisulfite sequencing (DBS) and methylated DNA immunoprecipitation on chip (MeDIP-chip). Gene ontology (GO) analysis of the genes associated with these tDMRs showed significant (absolute Z score>1.96) enrichment for genes involved in neural differentiation, including, for example, Jag1 and Tcf4. Our results provide robust evidence for the relevance of DNA methylation in early neural development and identify novel marker candidates for neural cell differentiation.
Monozygotic (MZ) twin pair discordance for childhood-onset Type 1 Diabetes (T1D) is ∼50%, implicating roles for genetic and non-genetic factors in the aetiology of this complex autoimmune disease. Although significant progress has been made in elucidating the genetics of T1D in recent years, the non-genetic component has remained poorly defined. We hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology and, thus, performed an epigenome-wide association study (EWAS) for this disease. We generated genome-wide DNA methylation profiles of purified CD14+ monocytes (an immune effector cell type relevant to T1D pathogenesis) from 15 T1D–discordant MZ twin pairs. This identified 132 different CpG sites at which the direction of the intra-MZ pair DNA methylation difference significantly correlated with the diabetic state, i.e. T1D–associated methylation variable positions (T1D–MVPs). We confirmed these T1D–MVPs display statistically significant intra-MZ pair DNA methylation differences in the expected direction in an independent set of T1D–discordant MZ pairs (P = 0.035). Then, to establish the temporal origins of the T1D–MVPs, we generated two further genome-wide datasets and established that, when compared with controls, T1D–MVPs are enriched in singletons both before (P = 0.001) and at (P = 0.015) disease diagnosis, and also in singletons positive for diabetes-associated autoantibodies but disease-free even after 12 years follow-up (P = 0.0023). Combined, these results suggest that T1D–MVPs arise very early in the etiological process that leads to overt T1D. Our EWAS of T1D represents an important contribution toward understanding the etiological role of epigenetic variation in type 1 diabetes, and it is also the first systematic analysis of the temporal origins of disease-associated epigenetic variation for any human complex disease.
Type 1 diabetes (T1D) is a complex autoimmune disease affecting >30 million people worldwide. It is caused by a combination of genetic and non-genetic factors, leading to destruction of insulin-secreting cells. Although significant progress has recently been made in elucidating the genetics of T1D, the non-genetic component has remained poorly defined. Epigenetic modifications, such as methylation of DNA, are indispensable for genomic processes such as transcriptional regulation and are frequently perturbed in human disease. We therefore hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology, and we performed a genome-wide DNA methylation analysis of a specific subset of immune cells (monocytes) from monozygotic twins discordant for T1D. This revealed the presence of T1D–specific methylation variable positions (T1D–MVPs) in the T1D–affected co-twins. Since these T1D–MVPs were found in MZ twins, they cannot be due to genetic differences. Additional experiments revealed that some of these T1D–MVPs are found in individuals before T1D diagnosis, suggesting they arise very early in the process that leads to overt T1D and are not simply due to post-disease associated factors (e.g. medication or long-term metabolic changes). T1D–MVPs may thus potentially represent a previously unappreciated, and important, component of type 1 diabetes risk.
The major histocompatibility complex (MHC) is a group of genes with a variety of roles in the innate and adaptive immune responses. MHC genes form a genetically linked cluster in eutherian mammals, an organization that is thought to confer functional and evolutionary advantages to the immune system. The tammar wallaby (Macropus eugenii), an Australian marsupial, provides a unique model for understanding MHC gene evolution, as many of its antigen presenting genes are not linked to the MHC, but are scattered around the genome.
Here we describe the 'core' tammar wallaby MHC region on chromosome 2q by ordering and sequencing 33 BAC clones, covering over 4.5 Mb and containing 129 genes. When compared to the MHC region of the South American opossum, eutherian mammals and non-mammals, the wallaby MHC has a novel gene organization. The wallaby has undergone an expansion of MHC class II genes, which are separated into two clusters by the class III genes. The antigen processing genes have undergone duplication, resulting in two copies of TAP1 and three copies of TAP2. Notably, Kangaroo Endogenous Retroviral Elements are present within the region and may have contributed to the genomic instability.
The wallaby MHC has been extensively remodeled since the American and Australian marsupials last shared a common ancestor. The instability is characterized by the movement of antigen presenting genes away from the core MHC, most likely via the presence and activity of retroviral elements. We propose that the movement of class II genes away from the ancestral class II region has allowed this gene family to expand and diversify in the wallaby. The duplication of TAP genes in the wallaby MHC makes this species a unique model organism for studying the relationship between MHC gene organization and function.
DNA methylation constitutes the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation reference profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of 6 annotation categories revealed evolutionarily conserved regions to be the predominant sites for differential DNA methylation and a core region surrounding the transcriptional start site as an informative surrogate for promoter methylation. We find 17% of the 873 analyzed genes differentially methylated in their 5′-untranslated regions (5′-UTR) and about one third of the differentially methylated 5′-UTRs to be inversely correlated with transcription. While our study was controlled for factors reported to affect DNA methylation such as sex and age, we did not find any significant attributable effects. Our data suggest DNA methylation to be ontogenetically more stable than previously thought.
Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40×10⁻⁴, permutation p = 1.0×10⁻³). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13×10⁻⁷). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases.
Analysis across the genome of patterns of DNA methylation reveals a rich landscape of allele-specific epigenetic modification and consequent effects on allele-specific gene expression.
DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found that 68.4% of CpG sites and <0.2% of non-CpG sites were methylated, demonstrating that non-CpG cytosine methylation is minor in human PBMC. Analysis of the PBMC methylome revealed a rich epigenomic landscape for 20 distinct genomic features, including regulatory, protein-coding, non-coding, RNA-coding, and repeat sequences. Integration of our methylome data with the YH genome sequence enabled a first comprehensive assessment of allele-specific methylation (ASM) between the two haploid methylomes of any individual and allowed the identification of 599 haploid differentially methylated regions (hDMRs) covering 287 genes. Of these, 76 genes had hDMRs within 2 kb of their transcriptional start sites of which >80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic research and confirms new sequencing technology as a paradigm for large-scale epigenomics studies.
Epigenetic modifications such as addition of methyl groups to cytosine in DNA play a role in regulating gene expression. To better understand these processes, knowledge of the methylation status of all cytosine bases in the genome (the methylome) is required. DNA methylation can differ between the two gene copies (alleles) in each cell. Such allele-specific methylation (ASM) can be due to parental origin of the alleles (imprinting), X chromosome inactivation in females, and other as yet unknown mechanisms. This may significantly alter the expression profile arising from different allele combinations in different individuals. Using advanced sequencing technology, we have determined the methylome of human peripheral blood mononuclear cells (PBMC). Importantly, the PBMC were obtained from the same male Han Chinese individual whose complete genome had previously been determined. This allowed us, for the first time, to study genome-wide differences in ASM. Our analysis shows that ASM in PBMC is higher than can be accounted for by regions known to undergo parent-of-origin imprinting and frequently (>80%) correlates with allele-specific expression (ASE) of the corresponding gene. In addition, our data reveal a rich landscape of epigenomic variation for 20 genomic features, including regulatory, coding, and non-coding sequences, and provide a valuable resource for future studies. Our work further establishes whole-genome sequencing as an efficient method for methylome analysis.
DNA methylation is an epigenetic mark linking DNA sequence and transcription regulation, and therefore plays an important role in phenotypic plasticity. The ideal whole genome methylation (methylome) assay should be accurate, affordable, high-throughput and agnostic with respect to genomic features. To this end, the methylated DNA immunoprecipitation (MeDIP) assay provides a good balance of these criteria. In this Methods paper, we present AutoMeDIP-seq, a technique that combines an automated MeDIP protocol with library preparation steps for subsequent second-generation sequencing. We assessed recovery of DNA sequences covering a range of CpG densities using in vitro methylated λ-DNA fragments (and their unmethylated counterparts) spiked-in against a background of human genomic DNA. We show that AutoMeDIP is more reliable than manual protocols, shows a linear recovery profile of fragments related to CpG density (R² = 0.86), and that it is highly specific (>99%). AutoMeDIP-seq offers a competitive approach to high-throughput methylome analysis of medium to large cohorts.
DNA methylation; Automation; Whole genome; High-throughput sequencing; MeDIP
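The linear recovery profile reported for AutoMeDIP is summarized by a coefficient of determination. As a rough illustration of how such a value might be computed for a simple linear fit of recovery against CpG density, the sketch below uses the fact that, for a one-predictor least-squares line, R² equals the squared Pearson correlation. The function name and the toy numbers are hypothetical and are not the study's data.

```python
def r_squared(x, y):
    """R² of a simple least-squares line y ~ a + b*x (equals squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Toy values (hypothetical): CpG density of spiked-in fragments vs. recovery.
cpg_density = [1.0, 2.0, 3.0]
recovery = [1.0, 3.0, 2.0]
toy_r2 = r_squared(cpg_density, recovery)
```

The function assumes both inputs vary; a constant `x` or `y` would make the denominator zero.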
Diabetic nephropathy is a serious complication of diabetes mellitus and is associated with considerable morbidity and high mortality. There is increasing evidence to suggest that dysregulation of the epigenome is involved in diabetic nephropathy. We assessed whether epigenetic modification of DNA methylation is associated with diabetic nephropathy in a case-control study of 192 Irish patients with type 1 diabetes mellitus (T1D). Cases had T1D and nephropathy whereas controls had T1D but no evidence of renal disease.
We performed DNA methylation profiling in bisulphite converted DNA from cases and controls using the recently developed Illumina Infinium® HumanMethylation27 BeadChip, which enables the direct investigation of 27,578 individual cytosines at CpG loci throughout the genome, focused on the promoter regions of 14,495 genes.
Singular Value Decomposition (SVD) analysis indicated that significant components of DNA methylation variation correlated with patient age, time to onset of diabetic nephropathy, and sex. Adjusting for confounding factors using multivariate Cox-regression analyses, and with a false discovery rate (FDR) of 0.05, we observed 19 CpG sites that demonstrated correlations with time to development of diabetic nephropathy. Of note, this included one CpG site located 18 bp upstream of the transcription start site of UNC13B, a gene in which the first intronic SNP rs13293564 has recently been reported to be associated with diabetic nephropathy.
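The abstract reports a false discovery rate of 0.05 but does not name the procedure used to control it. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not a statement of the authors' method), site selection from per-CpG p-values could look like this; the function name and toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return sorted indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k with p_(k) <= (k / m) * fdr.
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Toy p-values (hypothetical), one per CpG site.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.2, 0.6]
selected = benjamini_hochberg(toy_p, fdr=0.05)
```

Note that the step-up rule rejects every hypothesis ranked at or below the cutoff, even if an intermediate p-value exceeds its own threshold.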
This high throughput platform was able to successfully interrogate the methylation state of individual cytosines and identified 19 prospective CpG sites associated with risk of diabetic nephropathy. These differences in DNA methylation are worthy of further follow-up in replication studies using larger cohorts of diabetic patients with and without nephropathy.
Allele-specific expression (ASE) is essential for normal development and many cellular processes but, if impaired, can result in disease. ASE is a feature of organisms with genomes consisting of more than one set of homologous chromosomes. The higher the number of chromosome sets (ploidy) per cell, the higher the potential complexity of ASE. Humans, for instance, are diploid (except germ cells, which are haploid), resulting in multiple possible expression states in time and space for each set of alleles. ASE is invoked and modulated by both genetic and epigenetic changes, affecting the underlying DNA sequence or chromatin of each allele, respectively. Although numerous methods have been developed to assay ASE, they usually require RNA to be available and are dependent upon genetic polymorphisms (such as single nucleotide polymorphisms (SNPs)) to differentiate between allelic transcripts. The rapid convergence to second-generation sequencing as the method of choice to examine genomic, epigenomic and transcriptomic data enables an integrated and more general approach to define and predict ASE, independent of SNPs. This 'Omni-Seq' approach has the potential to advance our understanding of the biology and pathophysiology of ASE-mediated processes by elucidating subtle combinatorial effects, leading to the accurate delineation of sub-phenotypes with consequential benefit for improved insight into disease etiology.
The genome of extraembryonic tissue, such as the placenta, is hypomethylated relative to that in somatic tissues. However, the origin and role of this hypomethylation remains unclear. The DNA methyltransferases DNMT1, -3A, and -3B are the primary mediators of the establishment and maintenance of DNA methylation in mammals. In this study, we investigated promoter methylation-mediated epigenetic down-regulation of DNMT genes as a potential regulator of global methylation levels in placental tissue. Although DNMT3A and -3B promoters lack methylation in all somatic and extraembryonic tissues tested, we found specific hypermethylation of the maintenance DNA methyltransferase (DNMT1) gene and found hypomethylation of the DNMT3L gene in full term and first trimester placental tissues. Bisulfite DNA sequencing revealed monoallelic methylation of DNMT1, with no evidence of imprinting (parent of origin effect). In vitro reporter experiments confirmed that DNMT1 promoter methylation attenuates transcriptional activity in trophoblast cells. However, global hypomethylation in the absence of DNMT1 down-regulation is apparent in non-primate placentas and in vitro derived human cytotrophoblast stem cells, suggesting that DNMT1 down-regulation is not an absolute requirement for genomic hypomethylation in all instances. These data represent the first demonstration of methylation-mediated regulation of the DNMT1 gene in any system and demonstrate that the unique epigenome of the human placenta includes down-regulation of DNMT1 with concomitant hypomethylation of the DNMT3L gene. This strongly implicates epigenetic regulation of the DNMT gene family in the establishment of the unique epigenetic profile of extraembryonic tissue in humans.
Development Differentiation/Tissue; DNA/Methylation; DNA/Methyltransferase; Epigenetics; Gene Transcription; Extraembryonic Tissue; Placenta; Trophoblast
Recent studies have shown that DNA methylation (DNAm) markers in peripheral blood may hold promise as diagnostic or early detection/risk markers for epithelial cancers. However, to date no study has evaluated the diagnostic and predictive potential of such markers in a large case control cohort and on a genome-wide basis.
By performing genome-wide DNAm profiling of a large ovarian cancer case control cohort, we here demonstrate that active ovarian cancer has a significant impact on the DNAm pattern in peripheral blood. Specifically, by measuring the methylation levels of over 27,000 CpGs in blood cells from 148 healthy individuals and 113 age-matched pre-treatment ovarian cancer cases, we derive a DNAm signature that can predict the presence of active ovarian cancer in blind test sets with an AUC of 0.8 (95% CI (0.74–0.87)). We further validate our findings in another independent set of 122 post-treatment cases (AUC = 0.76 (0.72–0.81)). In addition, we provide evidence for a significant number of candidate risk or early detection markers for ovarian cancer. Furthermore, by comparing the pattern of methylation with gene expression data from major blood cell types, we here demonstrate that age and cancer elicit common changes in the composition of peripheral blood, with a myeloid skewing that increases with age and which is further aggravated in the presence of ovarian cancer. Finally, we show that most cancer and age associated methylation variability is found at CpGs located outside of CpG islands.
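The predictive performance above is reported as an AUC. One standard way to read that number is as the probability that a randomly chosen case receives a higher signature score than a randomly chosen control. The sketch below computes it by direct pairwise comparison; the function name and toy scores are hypothetical and do not reproduce the study's classifier or data.

```python
def auc(case_scores, control_scores):
    """Fraction of case/control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy signature scores (hypothetical).
toy_cases = [0.9, 0.7, 0.6, 0.4]
toy_controls = [0.8, 0.5, 0.3, 0.2]
toy_auc = auc(toy_cases, toy_controls)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a few hundred samples; a rank-based formulation would be preferable for much larger sets.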
Our results underscore the potential of DNAm profiling in peripheral blood as a tool for detection or risk-prediction of epithelial cancers, and warrant further in-depth and higher CpG coverage studies to elucidate this role.
There are two main classes of natural killer (NK) cell receptors in mammals, the killer cell immunoglobulin-like receptors (KIR) and the structurally unrelated killer cell lectin-like receptors (KLR). While KIR represent the most diverse group of NK receptors in all primates studied to date, including humans, apes, and Old and New World monkeys, KLR represent the functional equivalent in rodents. Here, we report a first digression from this rule in lemurs, where the KLR (CD94/NKG2) rather than KIR constitute the most diverse group of NK cell receptors. We demonstrate that natural selection contributed to such diversification in lemurs and particularly targeted KLR residues interacting with the peptide presented by MHC class I ligands. We further show that lemurs lack a strict ortholog or functional equivalent of MHC-E, the ligands of non-polymorphic KLR in “higher” primates. Our data support the existence of a hitherto unknown system of polymorphic and diverse NK cell receptors in primates and of combinatorial diversity as a novel mechanism to increase NK cell receptor repertoire.
Most receptors of natural killer (NK) cells interact with highly polymorphic major histocompatibility complex (MHC) class I molecules and thereby regulate the activity of NK cells against infected or malignant target cells. Whereas humans, apes, and Old and New World monkeys use the family of killer cell immunoglobulin-like receptors (KIR) as highly diverse NK cell receptors, this function is performed in rodents by the diverse family of lectin-like receptors Ly49. When did this functional separation occur in evolution? We followed this by investigating lemurs, primates that are distantly related to humans. We show here that lemurs employ the CD94/NKG2 family as their highly diversified NK cell receptors. The CD94/NKG2 receptors also belong to the lectin-like receptor family, but are rather conserved in “higher” primates and rodents. We could further demonstrate that lemurs have a single Ly49 gene like other primates but lack functional KIR genes of the KIR3DL lineage and show major deviations in their MHC class I genomic organisation. Thus, lemurs have evolved a “third way” of polymorphic and diverse NK cell receptors. In addition, the multiplied lemur CD94/NKG2 receptors can be freely combined, thereby forming diverse receptors. This is, therefore, the first description of some combinatorial diversity of NK cell receptors.
MHC class I antigens are encoded by a rapidly evolving gene family comprising classical and non-classical genes that are found in all vertebrates and involved in diverse immune functions. However, there is a fundamental difference between the organization of class I genes in mammals and non-mammals. Non-mammals have a single classical gene responsible for antigen presentation, which is linked to the antigen processing genes, including TAP. This organization allows co-evolution of advantageous class Ia/TAP haplotypes. In contrast, mammals have multiple classical genes within the MHC, which are separated from the antigen processing genes by class III genes. It has been hypothesized that separation of classical class I genes from antigen processing genes in mammals allowed them to duplicate. We investigated this hypothesis by characterizing the class I genes of the tammar wallaby, a model marsupial that has a novel MHC organization, with class I genes located within the MHC and 10 other chromosomal locations.
Sequence analysis of 14 BACs containing 15 class I genes revealed that nine class I genes, including one to three classical class I, are not linked to the MHC but are scattered throughout the genome. Kangaroo Endogenous Retroviruses (KERVs) were identified flanking the MHC un-linked class I. The wallaby MHC contains four non-classical class I, interspersed with antigen processing genes. Clear orthologs of non-classical class I are conserved in distant marsupial lineages.
We demonstrate that classical class I genes are not linked to antigen processing genes in the wallaby and provide evidence that retroviral elements were involved in their movement. The presence of retroviral elements most likely facilitated the formation of recombination hotspots and subsequent diversification of class I genes. The classical class I have moved away from antigen processing genes in eutherian mammals and the wallaby independently, but both lineages appear to have benefited from this loss of linkage by increasing the number of classical genes, perhaps enabling response to a wider range of pathogens. The discovery of non-classical orthologs between distantly related marsupial species is unusual for the rapidly evolving class I genes and may indicate an important marsupial specific function.
Plasma concentrations of biologically active vitamin D (1,25-(OH)2D) are tightly controlled via feedback regulation of renal 1α-hydroxylase (CYP27B1; positive) and 24-hydroxylase (CYP24A1; catabolic) enzymes. In pregnancy, this regulation is uncoupled, and 1,25-(OH)2D levels are significantly elevated, suggesting a role in pregnancy progression. Epigenetic regulation of CYP27B1 and CYP24A1 has previously been described in cell and animal models, and despite emerging evidence for a critical role of epigenetics in placentation generally, little is known about the regulation of enzymes modulating vitamin D homeostasis at the fetomaternal interface. In this study, we investigated the methylation status of genes regulating vitamin D bioavailability and activity in the placenta. No methylation of the VDR (vitamin D receptor) and CYP27B1 genes was found in any placental tissues. In contrast, the CYP24A1 gene is methylated in human placenta, purified cytotrophoblasts, and primary and cultured chorionic villus sampling tissue. No methylation was detected in any somatic human tissue tested. Methylation was also evident in marmoset and mouse placental tissue. All three genes were hypermethylated in choriocarcinoma cell lines, highlighting the role of vitamin D deregulation in this cancer. Gene expression analysis confirmed a reduced capacity for CYP24A1 induction with promoter methylation in primary cells and in vitro reporter analysis demonstrated that promoter methylation directly down-regulates basal promoter activity and abolishes vitamin D-mediated feedback activation. This study strongly suggests that epigenetic decoupling of vitamin D feedback catabolism plays an important role in maximizing active vitamin D bioavailability at the fetomaternal interface.
The proteins encoded by the classical HLA class I and class II genes in the major histocompatibility complex (MHC) are highly polymorphic and play an essential role in self/non-self immune recognition. HLA variation is a crucial determinant of transplant rejection and susceptibility to a large number of infectious and autoimmune diseases [1]. Yet identification of causal variants is problematic due to linkage disequilibrium (LD) that extends across multiple HLA and non-HLA genes in the MHC [2,3]. We therefore set out to characterize the LD patterns between the highly polymorphic HLA genes and background variation by typing the classical HLA genes and >7,500 common single nucleotide polymorphisms (SNPs) and deletion/insertion polymorphisms (DIPs) across four population samples. The analysis provides informative tag SNPs that capture some of the variation in the MHC region and that could be used in initial disease association studies, and provides new insight into the evolutionary dynamics and ancestral origins of the HLA loci and their haplotypes.
Meiotic recombination between highly similar duplicated sequences (non-allelic homologous recombination, NAHR) generates deletions, duplications, inversions, and translocations, and is responsible for genetic diseases known as ‘genomic disorders’, most of which are caused by altered copy number of dosage-sensitive genes. NAHR hotspots have been identified within some duplicated sequences. We have developed sperm-based assays to measure the de novo rate of reciprocal deletions and duplications at four NAHR hotspots. We used these assays to dissect the relative rates of NAHR between different pairs of duplicated sequences. We show that: (i) these NAHR hotspots are specific to meiosis, (ii) deletions are generated at a higher rate than their reciprocal duplications in the male germline, and (iii) some of these genomic disorders are likely to have been under-ascertained clinically, most notably the duplication of 7q11, the reciprocal of the Williams-Beuren Syndrome deletion.
The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
Numerous genetic association studies have implicated the KIAA0319 gene on human chromosome 6p22 in dyslexia susceptibility. The causative variant(s) remains unknown but may modulate gene expression, given that (1) a dyslexia-associated haplotype has been implicated in the reduced expression of KIAA0319, and (2) the strongest association has been found for the region spanning exon 1 of KIAA0319. Here, we test the hypothesis that variant(s) responsible for reduced KIAA0319 expression resides on the risk haplotype close to the gene's transcription start site. We identified seven single-nucleotide polymorphisms on the risk haplotype immediately upstream of KIAA0319 and determined that three of these are strongly associated with multiple reading-related traits. Using luciferase-expressing constructs containing the KIAA0319 upstream region, we characterized the minimal promoter and additional putative transcriptional regulator regions. This revealed that the minor allele of rs9461045, which shows the strongest association with dyslexia in our sample (max p-value = 0.0001), confers reduced luciferase expression in both neuronal and non-neuronal cell lines. Additionally, we found that the presence of this rs9461045 dyslexia-associated allele creates a nuclear protein-binding site, likely for the transcriptional silencer OCT-1. Knocking down OCT-1 expression in the neuronal cell line SHSY5Y using an siRNA restores KIAA0319 expression from the risk haplotype to nearly that seen from the non-risk haplotype. Our study thus pinpoints a common variant as altering the function of a dyslexia candidate gene and provides an illustrative example of the strategic approach needed to dissect the molecular basis of complex genetic traits.
Dyslexia, or reading disability, is a common disorder caused by both genetic and environmental factors. Genetic studies have implicated a number of genes as candidates for playing a role in dyslexia. We functionally characterized one such gene (KIAA0319) to identify variant(s) that might affect gene expression and contribute to the disorder. We discovered a variant residing outside of the protein-coding region of KIAA0319 that reduces expression of the gene. This variant creates a binding site for the transcription factor OCT-1. Previous studies have shown that OCT-1 binding to a specific DNA sequence upstream of a gene can reduce the expression of that gene. In this case, reduced KIAA0319 expression could lead to improper development of regions of the brain involved in reading ability. This is the first study to identify a functional variant implicated in dyslexia. More broadly, our study illustrates the steps that can be utilized for identifying mutations causing other complex genetic disorders.
DNA methylation is an indispensable epigenetic modification of mammalian genomes. Consequently, there is great interest in strategies for genome-wide/whole-genome DNA methylation analysis, and immunoprecipitation-based methods have proven to be a powerful option. Such methods are rapidly shifting the bottleneck from data generation to data analysis, necessitating the development of better analytical tools. Until now, a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling has been the inability to estimate absolute methylation levels. Here we report the development of a novel cross-platform algorithm – Bayesian Tool for Methylation Analysis (Batman) – for analyzing Methylated DNA Immunoprecipitation (MeDIP) profiles generated using arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). The latter is an approach we have developed to elucidate the first high-resolution whole-genome DNA methylation profile (DNA methylome) of any mammalian genome. MeDIP-seq/MeDIP-chip combined with Batman represent robust, quantitative, and cost-effective functional genomic strategies for elucidating the function of DNA methylation.
Smoking behavior has been associated in two independent European cohorts with the most common Caucasian human leukocyte antigen (HLA) haplotype (A1-B8-DR3). We aimed to test whether polymorphic members of the two odorant receptor (OR) clusters within the extended HLA complex might be responsible for the observed association, by genotyping a cohort of Hungarian women in which the mentioned association had been found. One hundred and eighty HLA haplotypes from Centre d’Etude du Polymorphisme Humain families were analyzed in silico to identify single-nucleotide polymorphisms (SNPs) within OR genes that are in linkage disequilibrium with the A1-B8-DR3 haplotype, as well as with two other haplotypes indirectly linked to smoking behavior. A nonsynonymous SNP within the OR12D3 gene (rs3749971T) was found to be linked to the A1-B8-DR3 haplotype. This polymorphism leads to a 97Thr → Ile exchange that affects a putative ligand binding region of the OR12D3 protein. Smoking was found to be associated in the Hungarian cohort with the rs3749971T allele (p = 1.05×10−2), with higher significance than with A1-B8-DR3 (p = 2.38×10−2). Our results link smoking to a distinct OR allele, and demonstrate that the rs3749971T polymorphism is associated with the HLA haplotype-dependent differential recognition of cigarette smoke components, at least among Caucasian women.
The imprinted insulin-like growth factor 2 (IGF2) gene is expressed predominantly from the paternal allele. Loss of imprinting (LOI) associated with hypomethylation at the promoter proximal sequence (DMR0) of the IGF2 gene was proposed as a predisposing constitutive risk biomarker for colorectal cancer. We used pyrosequencing to assess whether IGF2 DMR0 hypomethylation is present constitutively prior to cancer or is acquired tissue-specifically after the onset of cancer. DNA samples from tumour tissues and matched non-tumour tissues from 22 breast and 42 colorectal cancer patients as well as peripheral blood samples obtained from colorectal cancer patients [SEARCH (n = 192 cases, 96 controls)], breast cancer patients [ABC (n = 364 cases, 96 controls)] and the European Prospective Investigation of Cancer [EPIC-Norfolk (n = 228 breast, 225 colorectal, 895 controls)] were analysed. The EPIC samples were collected 2–5 years prior to diagnosis of breast or colorectal cancer. IGF2 DMR0 methylation levels in tumours were lower than in matched non-tumour tissue. Hypomethylation of DMR0 was detected in breast (33%) and colorectal (80%) tumour tissues at a higher frequency than LOI, indicating that methylation levels are a better indicator of cancer than LOI. In the EPIC population, the prevalence of IGF2 DMR0 hypomethylation was 9.5%, and this correlated with increased age, not cancer risk. Thus, IGF2 DMR0 hypomethylation occurs as an acquired tissue-specific somatic event rather than a constitutive innate epimutation. These results indicate that IGF2 DMR0 hypomethylation has diagnostic potential for colon cancer rather than value as a surrogate biomarker for constitutive LOI.
The major histocompatibility complex (MHC) is essential for human immunity and is highly associated with common diseases, including cancer. While the genetics of the MHC has been studied intensively for many decades, very little is known about the epigenetics of this most polymorphic and disease-associated region of the genome.
To facilitate comprehensive epigenetic analyses of this region, we have generated a genomic tiling array of 2 Kb resolution covering the entire 4 Mb MHC region. The array has been designed to be compatible with chromatin immunoprecipitation (ChIP), methylated DNA immunoprecipitation (MeDIP), array comparative genomic hybridization (aCGH) and expression profiling, including of non-coding RNAs. The array comprises 7832 features, consisting of two replicates of both forward and reverse strands of MHC amplicons and appropriate controls.
Using MeDIP, we demonstrate the application of the MHC array for DNA methylation profiling and the identification of tissue-specific differentially methylated regions (tDMRs). Based on the analysis of two tissues and two cell types, we identified 90 tDMRs within the MHC and describe their characterisation.
A tiling array covering the MHC region was developed and validated. Its successful application for DNA methylation profiling indicates that this array represents a useful tool for molecular analyses of the MHC in the context of medical genomics.