We have used array-based DNaseI hypersensitive site mapping to identify a novel regulatory element 45 kb downstream of the mouse Scl promoter 1a. We have shown that this element binds CTCF using both in vivo and in vitro assays, and that protein-DNA contacts extend to 57 base pairs around a CTCF-consensus binding site. Furthermore, we have shown that this element functions as an insulator in enhancer-blocking assays when tested in haematopoietic cell lines, and we have used a novel transgenic reporter assay to demonstrate that the element can enhance both the efficiency and the specificity of midbrain-specific expression compared to conventional transgenic analysis. Moreover, analysis of genome-wide data from the ENCODE consortium confirmed that the +45 element is a ubiquitous CTCF-bound insulator region. This element is identified as a DNaseI HSS bound by CTCF in a variety of Scl/Map17-expressing and non-expressing tissues, including haematopoietic tissues (bone marrow and spleen), brain, kidney, liver, heart and lung.
The spatial location of this Scl insulator in relation to neighbouring genes has clear similarities with other recently identified insulator sequences. Mapping of genome-wide DNaseI sites and chromatin structural elements has confirmed that many genes appear to be flanked by HSS that are ubiquitous across a range of cell types. Data from high-throughput protocols have shown that 86% of ubiquitous DNaseI HSS are within 2 kb of a transcription start site (TSS), and of the remaining ubiquitous DNaseI HSS, which are distal to a TSS, over 70% have been shown to bind CTCF in ChIP-chip assays. Furthermore, analysis of the recently identified CTCF consensus binding site, shared by this Scl insulator, has shown that more than 80% of evolutionarily conserved sites are more than 10 kb distal to a TSS. In this study, we have used DNase-Chip technology to show that the +45 element is a DNaseI HSS in both Scl-expressing and Scl-non-expressing cells, and using a range of in vitro and in vivo assays, we have shown that it has CTCF-dependent insulator function. ChIP-chip assays for CTCF in human cells have now also identified the equivalent human sequence as a CTCF binding site.
Several insulators have been identified to date, the chicken 5′ HSIV being the first identified in vertebrates and by far the best characterized. This element possesses both enhancer-blocking and barrier activity, the hallmarks of an insulator element. Whereas the enhancer-blocking activity is exclusively dependent on CTCF binding, its barrier activity is dependent on the binding of transcription factors such as USF1 and USF2. A second insulator has been identified at the 3′ end of the chicken β-globin locus, consisting exclusively of a CTCF binding site and possessing exclusively enhancer-blocking activity. The different functions of these two insulators reflect the status of the adjacent chromatin. The presence of heterochromatin 5′ of the chicken β-globin locus requires a boundary element to prevent it from spreading over the locus. The olfactory receptor genes located 3′ of the β-globin genes are not expressed in erythroid cells but are expressed in other tissues; therefore the 3′ insulator needs only to possess enhancer-blocking activity. Several other insulators with exclusively enhancer-blocking activity have been identified, such as the Igf2/H19 ICR, which contains four CTCF binding sites. To investigate which type of function is performed by the +45 element, we performed ChIP-chip for histone modifications characteristic of active (H3K9Ac) and repressive (H3K27me3) chromatin. In expressing cells (416B), the Scl/Map17 locus is clearly marked by H3K9Ac, which is absent over Cyp4X1, whereas in non-expressing cells it is marked by the repressive modification H3K27me3. These experiments indicate that the +45 element is located at the boundary between chromatin domains, where the Scl/Map17 co-regulatory domain abuts the Cyp gene family.
Until recently, there was no clear consensus for the genomic sequence of the CTCF transcription factor-binding site. However, using high-throughput techniques, a new consensus CTCF sequence has been proposed. Intriguingly, a number of well-characterised insulators such as the chicken HSIV, the mouse β-globin HS2 and the human myb HS2 have relatively little sequence similarity to one another, and display only partial similarity to the newly identified consensus sequence. In contrast, the +45 CTCF site that we have functionally mapped in this study shows tight sequence conservation with the new CTCF-consensus sequence, yet was not identified through CTCF binding site consensus mapping. This study therefore provides an independent functional verification of the new CTCF consensus sequence.
The CTCF protein is a large 11-zinc-finger protein that has been shown to contact DNA over many base pairs at other previously mapped insulator elements. In this study we have, for the first time, used in vivo DMS footprinting to map the extent of the protein-DNA interactions at a CTCF-dependent insulator. This shows that in vivo, the protein-DNA interaction extends to 57 base pairs of DNA, i.e. beyond the CTCF-consensus binding sequence, suggesting that the structure of the DNA surrounding the consensus binding site is likely to play a key role in the CTCF protein-DNA interaction. Moreover, we demonstrate that the 57 bp core fragment is both necessary and sufficient for insulator function in vitro.
To further dissect the +45 insulator function, we performed the classic enhancer-blocking assay in haematopoietic cell lines. Introduction of the full +45 element, the 350 bp core region or the 57FPR between the enhancer and the Neo reporter gene led to a reduction in Neo-resistant colonies, similar to that seen with the 1.2 kb cHSIV insulator. The number of colonies was unaltered or even increased when the CTCF binding site in the +45 element was deleted or mutated (del350, 350del57FPR and 350mutCTCF), indicating that the enhancer-blocking activity is CTCF-dependent. This may be explained by the fact that different cell lines were used for these assays (murine 416B and human K562). To check the insulator function of the +45 element in vivo, we elected to develop a transgenic mouse reporter model of insulator function. Using this approach, we have demonstrated that the +45 insulator can be used both to improve the frequency of transgene expression (i.e. the proportion of transgenic embryos which express LacZ) and to reduce ectopic transgene expression. The Scl promoters drive expression to the brain but not to haematopoietic tissues. Flanking the Scl promoter-driven LacZ reporter gene cassette with the +45 element very significantly reduced reporter gene expression in tissues other than brain, leaving the staining pattern expected for the Scl promoter. The discrepancy between the tissue types used for the in vitro (haematopoietic) and in vivo (brain) assays was necessitated by the restricted in vivo activity of the Scl and Map17 promoters, none of which drive expression in haematopoietic cells in vivo when assayed in transgenic mice. Taken together, these data raise the possibility that the +45 element could be a useful tool allowing effective insulation of transgenic constructs from adjacent chromatin.
In conclusion, we have presented data that identify and functionally define a putative novel insulator sequence located 45 kb downstream of the Scl transcription start site, at the boundary of chromatin domains. Our results suggest that the +45 element functions as a CTCF-dependent insulator that may prevent inappropriate activation both of the Cyp genes during haematopoietic differentiation and of Scl and Map17 during hepatic development.