Transcription factors (TFs) are a special class of proteins that regulate gene transcription by physically interacting with specific DNA sequence patterns, called motifs. To uncover these regulatory mechanisms, one promising approach is to identify all cis-acting targets, or binding sites, of a given TF on a genome-wide scale; this set is defined as the TF’s cistrome (Carroll, et al., 2006; Lupien, et al., 2008). The most popular technology for studying cistromes is chromatin immunoprecipitation coupled with sequencing (ChIP-Seq) (Johnson, et al., 2007). Briefly, in the ChIP step, DNA is fragmented into pieces a few hundred base pairs long, and the fragments bound by the TF of interest are enriched through immunoprecipitation. The enriched DNA fragments are then sequenced using massively parallel DNA sequencing technology, producing outputs called sequencing reads or tags. MACS was originally designed to give robust, high-resolution peak identification for ChIP-Seq data, and it has two main features (Zhang, et al., 2008). First, MACS empirically models the shift size of ChIP-Seq reads and uses it to improve the spatial resolution of the inferred TF binding sites. Second, MACS estimates a dynamic background read distribution that captures local biases in the genome, allowing more robust peak identification.
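The two features above can be illustrated with a minimal sketch. It is not the MACS source code: the function names are hypothetical, and the sketch assumes (following the general description in Zhang, et al., 2008, not this unit) that tags are shifted by half the estimated fragment size d toward their 3' ends, and that the dynamic background is the maximum of a genome-wide rate and several local window rates, tested under a Poisson model.

```python
import math

def shift_tags(positions, strands, d):
    """Shift each tag by d/2 toward its 3' end (assumed convention):
    plus-strand tags move right, minus-strand tags move left."""
    half = d // 2
    return [p + half if s == "+" else p - half
            for p, s in zip(positions, strands)]

def local_lambda(genome_rate, window_rates):
    """Dynamic background: the largest of the genome-wide expected count
    and the expected counts estimated from local windows."""
    return max([genome_rate] + list(window_rates))

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed by direct summation."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):   # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

# Hypothetical numbers for illustration only.
shifted = shift_tags([100, 300], ["+", "-"], d=200)
lam = local_lambda(2.0, [1.0, 5.0, 3.0])
p_value = poisson_upper_tail(10, lam)
```

In this toy example the two tags from opposite strands converge on the same position after shifting, and the candidate region is tested against the highest local rate instead of the genome-wide rate, which makes the call more conservative where the local background is elevated.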
Besides cistrome studies, ChIP-Seq technology is also widely used to generate epigenome profiles, especially of histone modification status (Barski, et al., 2007; Mikkelsen, et al., 2007). Because different histone modifications have distinct effects on the chromatin environment, either by altering histone binding to DNA or by providing recognition sites for chromatin effector modules, genome-wide ChIP-Seq approaches can greatly improve the understanding of the relationships between specific histone modifications and gene regulation outcomes. Although the procedure for generating histone modification ChIP-Seq data is quite similar to that for cistrome data, the distributions of sequencing reads in the two cases usually differ. For most TFs, the regions enriched in sequencing reads are discrete and typically form sharp peaks along the genome. For many types of histone modifications, however, the read distribution is more continuous, because the epigenetic status of neighboring nucleosomes tends to be similar, which usually results in quite broad peaks. With proper parameter settings, MACS performs well in detecting regions enriched for histone modifications. Similarly, MACS can also be applied to affinity-enrichment-based DNA methylation studies, such as those producing MeDIP-Seq data.
This unit first describes a basic protocol for analyzing FoxA1 ChIP-Seq data from the human MCF7 cell line. FoxA1 is a typical TF that regulates gene expression as a pioneer factor (Lupien, et al., 2008). This protocol includes a control sample. A second basic protocol analyzes H3K27me3 ChIP-Seq data from mouse ES cells (Mikkelsen, et al., 2007). H3K27me3 is a widely studied histone modification that produces broad peaks. Each protocol covers the data required to run MACS, the exact parameters with explanations, and the interpretation of the MACS results. After the two basic protocols, the basic idea behind MACS is presented. A complete list of MACS parameters with descriptions then follows for the user’s reference. The software is available via http://liulab.dfci.harvard.edu/MACS/. It is written in Python and distributed under the terms of the Artistic License. The latest version is 1.4.0beta. Specialized terms used in this unit are defined in .