Recent years have seen remarkable progress in our understanding of human genetics, enabled by the availability of the human genome sequence and increasingly high-throughput technologies for DNA analysis1. Yet despite their breadth and comprehensiveness, purely DNA sequence–level investigations do not shed light on a crucial component of human biology: how the same genome sequence can give rise to over 200 different cell types through remarkably consistent differentiation programs. This process of developmental specification, classically termed 'epigenesis', is now known to involve differential regulation of genes and their products2. Aberrant regulation of these processes has been extensively linked to human diseases and can also be influenced by environmental inputs3–5.
Gene regulation and genome function are intimately related to the physical organization of genomic DNA and in particular to the way it is packaged into chromatin, a complex nucleoprotein structure comprising histones, DNA-binding factors, accessory protein complexes and noncoding RNAs6–9 (Fig. 1). Chromatin is a dynamic entity that is subject to modification of both its DNA and protein components, with direct structural and functional consequences. The term 'epigenome' is used to describe the way in which these modifications and structural features are distributed across the genome in a given cell population. The epigenomic landscapes and the associated gene expression programs are maintained within a given cell lineage through complex processes that involve transcription factors, chromatin regulators, histone modifications and variants, and RNAs10–12, but that remain poorly understood in mammals.
Although the mechanisms remain obscure, an overwhelming body of evidence now supports central roles for epigenomic changes in disease susceptibility and pathogenesis. Multiple disease processes, including cancer, are now well known to be associated with characteristic alterations in the patterns of chromatin, DNA methylation and gene expression3,5. In addition, epidemiological studies have linked early environmental exposures, such as in utero starvation, to long-term health consequences ranging from metabolic disorders to psychiatric diseases13. A causal role for epigenomic aberrations is supported by several lines of evidence, including mutations of genes encoding chromatin regulators in developmental disorders and cancer4,14–16 and the therapeutic efficacy of small-molecule inhibitors of DNA methyltransferases and histone-modifying enzymes17.
Major epigenomic features can now be interrogated comprehensively by combining cellular, biochemical and molecular techniques with high-throughput sequencing. Production of genome-wide maps of cytosine methylation, histone modifications, chromatin accessibility and RNA transcripts represents a powerful and general approach for surveying the regulatory state of the genome in a cell type of interest. The resulting data define the locations and activation states of diverse functional elements, including genes and their transcriptional control elements (e.g., promoters, enhancers and insulators), noncoding transcripts and epigenetic effectors, such as imprinting control regions18–25. More globally, such maps can provide insight into developmental state and potential, for example, of a stem cell population, and can shed light on aberrant regulatory programs in diseased tissues.
Here we describe the aims and scope of the US National Institutes of Health (NIH) Roadmap Epigenomics Mapping Consortium, which has set out to provide a publicly accessible resource of epigenomic maps in stem cells and primary ex vivo tissues. These maps will detail the genome-wide landscapes of DNA methylation, histone modifications and related chromatin features, and are intended to provide a reference for studies of the genetic and epigenetic events that underlie human development, diversity and disease. Below, we describe the organizational structure, goals and anticipated deliverables of the consortium.