Genome-wide analysis of the EspR regulon
We investigated EspR-binding to the chromosome of Mtb
strain H37Rv during exponential growth by ChIP-Seq (chromatin immunoprecipitation followed by ultra-high-throughput DNA sequencing)
. Sequence reads obtained from two independent ChIP-Seq experiments using EspR-specific antibodies were mapped to the Mtb
H37Rv genome. Based on the peak detection criteria, we identified 165 loci harboring 582 EspR-binding peaks (Table S2) that were enriched by >1.5-fold; these peaks were absent from ChIP-Seq datasets from control experiments conducted without antibody or with unrelated antibodies (data not shown). The 165 loci were distributed across the genome and were located both in intergenic regions (45%) and within genes (55%), implying that EspR is not a classical transcriptional regulator.
Genome-wide mapping of EspR binding sites.
Binding of EspR to the Mtb chromosome.
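The intergenic versus intragenic split reported above can be illustrated with a short sketch. It is not the pipeline used in this study: the function names, the coordinates and the simplifying assumption that gene intervals do not overlap are all hypothetical, and each peak is reduced to a single summit coordinate.

```python
from bisect import bisect_right

def classify_peaks(summits, genes):
    """Label each peak summit 'intragenic' or 'intergenic'.

    summits: iterable of 1-based genome coordinates (hypothetical input).
    genes:   list of (start, end) tuples, 1-based inclusive; assumed
             non-overlapping for simplicity.
    """
    genes = sorted(genes)
    starts = [start for start, _ in genes]
    labels = []
    for pos in summits:
        # Index of the last gene that starts at or before this summit.
        i = bisect_right(starts, pos) - 1
        inside = i >= 0 and genes[i][0] <= pos <= genes[i][1]
        labels.append("intragenic" if inside else "intergenic")
    return labels

def fraction_intragenic(labels):
    """Fraction of summits that fall inside an annotated gene."""
    return labels.count("intragenic") / len(labels)

# Toy data for illustration only; not taken from the study.
toy_genes = [(100, 400), (600, 900), (1200, 1500)]
toy_summits = [50, 250, 500, 700, 1000, 1300, 1450, 2000]
toy_labels = classify_peaks(toy_summits, toy_genes)
print(toy_labels)
print(f"{fraction_intragenic(toy_labels):.0%} intragenic")
```

A real annotation contains overlapping and divergently oriented genes, so a production version would need an interval tree or a dedicated genome-arithmetic tool rather than a single binary search.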
Genes with upstream EspR-binding sites encode diverse functions, and classification by functional category reveals over-representation of cell wall/cell processes and the surface-exposed PE/PPE proteins (http://tuberculist.epfl.ch/
; ). Internal sites were found within AT-rich genes encoding proteins belonging to the PPE family (), like ppe24
), and some of these, such as ppe58
), also bind EspR at their 5′-ends. Binding sites were present within genes that are thought to have been acquired by horizontal transfer 
like the rv0986-rv0989c
A survey of the ten top-scoring peaks highlighted the major EspR-binding gene targets. Two of the top three sites occurred at a locus encoding an enzyme system that produces the complex lipids phthiocerol dimycocerosate (PDIM) and phenolic glycolipid (PGL)
. The second highest scoring site overlaps the translational start of rv1490
, which encodes a membrane protein of unknown function, and this was followed by three other peaks of lower intensity, separated by ~300–400 bp, spread across rv1490
(). The fourth and eighth highest scoring sites affect two genes, pe-pgrs19
, encoding mycobacteria-restricted PE_PGRS proteins, while the espACD
locus, which is preceded by three EspR-binding sites (), occurred in the fifth position of the top ten ChIP-Seq hits (). The ninth peak is sited in the intergenic region between lipF
), encoding a lipid esterase, and rv3488
, whereas the last peak of the top ten ChIP-Seq list was found at the 3′-end of fadB2
) encoding a beta-hydroxybutyryl-CoA dehydrogenase and upstream of umaA
) coding for a mycolic acid synthase. EspR binds to multiple sites in the ESX-1, ESX-2 and ESX-5 loci (Fig. S2
), as well as to two sites upstream of its own gene, thus implying autogenous control. Taken together, these data suggest that EspR may be involved in regulating cell wall function.
Top 10 EspR binding loci from ChIP-Seq.
Confirmation of in vivo EspR binding
To obtain independent confirmation for selected parts of the in vivo
dataset, we initially focused on the EspR-dependent espACD
. Our previous in vitro
work revealed two EspR binding sites separated by 19 bp and located between 506 and 444 bp upstream of espACD
, consistent with the presence of a ChIP-Seq peak in this region (). On closer inspection, two additional major peaks of EspR-enrichment were found further upstream of espA
(centered between −857 bp and −695 bp and between −1214 bp and −1113 bp, respectively). While this work was in progress, another report described two additional sites upstream of espACD
. The existence of these sites also corroborates results we obtained previously using AFM to visualize nucleoprotein complexes of EspR and a 1360 bp espACD
promoter fragment 
. AFM revealed loop structures stabilized by multiple EspR dimers of dimers, suggesting the presence of several distant EspR binding sites in the espACD
upstream region. The 5′-end of the espA
mRNA was located 66 bp upstream of the translation start codon using 5′ RACE (Fig. S3
). Consequently, the nearest EspR binding site is positioned over 300 bp upstream of the promoter.
To further validate EspR-binding peaks, with varying degrees of enrichment, we performed ChIP followed by quantitative PCR on 11 selected sites (four located within intergenic regions, three within ORFs and three overlapping a translational start) and two non-peak regions (within rv0888
ORF) as controls. All of the selected EspR-binding regions exhibited enrichment comparable to that observed from ChIP-Seq analysis (Fig. S4
), thus confirming that all of the tested peaks were genuine EspR targets.
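ChIP-qPCR enrichment of the kind described above is commonly expressed as percent of input, normalized to a non-peak control region. The sketch below shows that calculation under stated assumptions: the quantification method actually used here is not specified in the text, and the Ct values, the 1% input fraction and the assumption of perfect amplification efficiency are all hypothetical.

```python
import math

def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input chromatin recovered in the IP.

    Assumes 100% PCR efficiency (one doubling per cycle).
    input_fraction is the share of chromatin kept as input
    (hypothetical default of 1%).
    """
    # Shift the input Ct to the value expected had all chromatin been assayed.
    adjusted_input = ct_input - math.log2(1 / input_fraction)
    return 100 * 2 ** (adjusted_input - ct_ip)

def fold_enrichment(target, control, input_fraction=0.01):
    """Ratio of percent input at a peak to that at a non-peak control.

    target, control: (ct_ip, ct_input) pairs.
    """
    return (percent_input(*target, input_fraction)
            / percent_input(*control, input_fraction))

# Illustrative Ct values only; not measurements from this study.
peak = (22.0, 25.0)       # candidate EspR-bound region
non_peak = (26.0, 25.0)   # control region, e.g. within an ORF lacking a peak
print(percent_input(*peak))
print(percent_input(*non_peak))
print(fold_enrichment(peak, non_peak))
```

Normalizing to a non-peak region from the same immunoprecipitation cancels sample-to-sample differences in recovery, which is why control regions such as those within rv0888 are useful.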
To obtain further confirmation of the in vivo
EspR binding sites, we performed electrophoretic mobility shift assays (EMSAs) using ~100 bp DNA sequences covering the top five binding sites (Fig. S5
and ) and a DNA fragment of the same size from within the espA
ORF as a negative control. EspR was shown to bind to all five sites in a concentration dependent manner, while the negative control fragment remained unbound at an equal protein concentration. However, clear differences in affinity between the fragments were visible. For example, the top-scoring fadD26
peak bound EspR less strongly than the four others, suggesting that other determinants, such as long-range protein-protein or protein-DNA interactions, could contribute to the high-affinity binding observed in vivo.
Target gene regulation on EspR binding
To confirm the prediction that binding of EspR directly affects target gene expression, we exploited a pristinamycin-inducible system 
to overexpress espR
conditionally in Mtb
(strain H37Rv::pMYespR). Notably, compared with the controls, espR
over-expression significantly decreased growth after 24 h () while espR
transcript and EspR protein levels were found to be ~8-fold and ~3-fold higher than in the control after 72 h, respectively (). When the relative amounts of target transcripts in untreated and pristinamycin IA-treated H37Rv::pMYespR cells were measured by quantitative RT-PCR, significantly increased transcript levels were detected for rv1490
, and the ABC-transporter rv0986
(). Conversely, repression of lipF
transcription was also observed upon EspR overexpression, whereas transcription of some target genes appeared unchanged (). Using a discriminatory RT-PCR assay it was possible to measure the impact of EspR overproduction on expression of the chromosomal copy of espR
and, again, this appeared to act negatively ().
Gene expression associated with EspR binding.
The combined findings suggest that EspR is capable of both positive and negative transcriptional regulation. Moreover, the inability to observe direct EspR-dependent regulation at some major EspR binding sites suggests either that EspR has little or no effect on these genes under the conditions tested or that other regulators counterbalance the effect of increased EspR levels.
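Relative transcript levels from quantitative RT-PCR are often computed with the 2^-ΔΔCt approach and then sorted into induced, repressed and unchanged genes. The following is a minimal sketch of that logic, assuming a single reference gene, perfect amplification efficiency and a two-fold cut-off. None of these choices is stated in the text, and the Ct values are invented for illustration.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression (treated vs control) by the 2^-ddCt approach."""
    d_treated = ct_target_treated - ct_ref_treated
    d_control = ct_target_control - ct_ref_control
    return 2 ** -(d_treated - d_control)

def classify(fc, threshold=2.0):
    """Call a gene 'up', 'down' or 'unchanged' at a hypothetical cut-off."""
    if fc >= threshold:
        return "up"
    if fc <= 1 / threshold:
        return "down"
    return "unchanged"

# Invented Ct values: (target treated, reference treated,
#                      target control, reference control)
toy_genes = {
    "geneA": (21.0, 18.0, 23.0, 18.0),
    "geneB": (25.0, 18.0, 23.0, 18.0),
    "geneC": (23.0, 18.0, 23.0, 18.0),
}
calls = {name: classify(fold_change(*cts)) for name, cts in toy_genes.items()}
print(calls)
```

In practice the calls would also rest on replicate measurements and a statistical test, which this sketch omits.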
EspR is not secreted
To determine if the low intracellular levels of EspR observed at the early and mid-log phases of growth were due to intensive EspR secretion, we measured intra- and extra-cellular levels of EspR from strains Mtb H37Rv and Mtb H37RvΔRD1 cultured in Sauton's medium to mid-log phase. Under these conditions, we were unable to detect EspR among the culture filtrate (CF) proteins in either case, whereas EsxA was present in the CF of Mtb H37Rv, as expected, but not in CF from the ESX-1 mutant H37RvΔRD1 that lacks esxA among other genes ().
To investigate whether EspR was exported from the cytosol but retained in the cell envelope, whole cell lysate (CL) was fractionated by ultracentrifugation into the cell wall/cell membrane (W/M) and cytosolic (CYT) components. Since the chromosome is known to be attached to the plasma membrane 
, half of the samples were treated with DNase I. EspR was detected in both of the untreated fractions but was mainly in the cytosol after DNase I treatment (). Since previous studies were performed with the Erdman strain of Mtb
, this provided a possible explanation for the localization discrepancy. Consequently, we repeated the experiment with the Mtb
Erdman strain and the ESX-1 mutant Mtb
Erdman 36–72 that fails to secrete EsxA 
. Again, EspR was below the level of detection in the CF of either strain, whereas EsxA appeared in the CF of Mtb
Erdman (Fig. S7
). We then examined CF at different time points of Erdman cultures for the presence of EspR and the cytosolic marker GroEL2. EspR first appeared in the culture filtrate after 8 days of growth when it was accompanied by GroEL2, indicating that cell lysis had likely occurred ().
The EspR protein has attracted considerable interest because of its role in the regulation of virulence in Mtb
and the remarkable, and most unusual, property of being secreted by the very secretion system ESX-1 whose expression it controls 
. This has led to the suggestion that a negative feedback loop modulates EspR secretion and is critical to successful infection. Studies of the three-dimensional structure of EspR and its truncated variant, EspRΔ10, together with models of DNA recognition and AFM analysis of single molecule EspR nucleoprotein complexes, indicate that EspR employs an atypical DNA recognition mechanism 
. A dimer of dimers is thought to bind to DNA via one monomer of each dimer leaving the second monomer free to contact another site. Cooperative interactions then lead to multimerization and the formation of looped structures in which EspR acts as a bridge between two separated, or even remote, sites on DNA 
. Such structures are typically formed by nucleoid-associated proteins (NAPs), like H-NS and Fis 
. Collectively, these features led us to consider the possibility that EspR functions as a NAP rather than as a specific transcriptional activator of a limited number of genes required for pathogenesis.
To test this possibility, ChIP-Seq analysis was performed to assess the genome-wide distribution of EspR and to identify the sites and genes to which it binds. This resulted in the identification of at least 165 loci, often containing multiple peaks of EspR-binding, throughout the genome. Binding sites were distributed more or less evenly between intragenic and intergenic regions. About half of the genes (~50%) were involved in cell envelope functions. Among them were genes that contribute to ESX-1, ESX-2 and ESX-5-related activities, and others that contribute to mycobacterial virulence, such as lipF and the PDIM locus. Surprisingly, the latter harbored the major peak of EspR binding in the genome whereas the previously reported site preceding the espACD operon was less prominent.
Confirmation of EspR binding to a selection of major sites was obtained in vitro
, from a combination of studies performed with highly purified EspR (), and in vivo
, following overexpression of the protein (). This resulted in the definition of a consensus sequence, TTTGC
], that agrees well with the motif predicted previously by molecular dynamic simulations, involving computation of the binding energy of the optimal interaction of EspR with the intergenic region upstream of espA
. These predictions were consistent with the findings of DNase footprinting or EMSA studies of the same locus. Using an in silico
approach to scan the genome sequence, >1,000 potential EspR sites were found, of which 163 had been detected experimentally by ChIP-Seq. While some of the in silico
predictions may be fortuitous, this does raise the possibility that occupancy of EspR-binding sites may vary with growth phase or physiological conditions and that more sites will be uncovered. Furthermore, the EspR-binding motif deduced here should be considered as a core sequence for high affinity nucleation sites from where cooperative binding between EspR dimers can initiate and extend to form long oligomers and hence reach more distant sites 
. The number of such sites and the distance between them most probably enables EspR to structure the chromosome.
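An in silico scan of the kind mentioned above can be sketched as a search for a motif on both DNA strands. This is only an illustration and not the procedure used in this study: the query is the TTTGC core that appears in the text rather than the full consensus, exact matching stands in for whatever scoring scheme was actually applied, and the test sequence is invented.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def scan_motif(seq, motif):
    """Return sorted (position, strand) hits for exact motif matches.

    position is the 0-based start of the match on the forward strand;
    '-' hits are found by searching for the reverse complement.
    A palindromic motif would be reported once per strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # Zero-width lookahead so that overlapping matches are all reported.
        for match in re.finditer(f"(?={re.escape(pattern)})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)

# Invented sequence for illustration only.
toy_seq = "GGTTTGCAAAGCAAACC"
toy_hits = scan_motif(toy_seq, "TTTGC")
print(toy_hits)
```

Exact matching to a short core will over-predict sites in a 4.4 Mb genome, which is consistent with the caution above that some in silico predictions may be fortuitous.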
The number of EspR molecules per cell was estimated by quantitative Western blotting at different stages of the growth cycle. There was a steady increase in concentration until ~100,000 molecules/cell were found at day 5. These levels are about 30-fold higher than those of well-characterized transcriptional activators, like Fnr in E. coli
, but are comparable to levels of major NAPs, such as Fis or HU, during the exponential growth phase of E. coli
. The intracellular EspR concentration is clearly in excess of that required to occupy all the experimentally detected (582) or computer predicted (1026) binding sites.
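The excess of EspR over its binding sites can be checked with simple arithmetic. The sketch below uses the figures quoted above (about 100,000 molecules per cell, 582 detected peaks, 1026 predicted sites). It additionally assumes, for illustration only, one chromosome copy per cell and one dimer of dimers (four monomers) per occupied site.

```python
MOLECULES_PER_CELL = 100_000   # approximate day-5 estimate quoted in the text
SITES_CHIP_SEQ = 582           # experimentally detected peaks
SITES_PREDICTED = 1026         # computer-predicted sites
MONOMERS_PER_UNIT = 4          # assumption: one dimer of dimers per site

def monomers_per_site(molecules, sites):
    """EspR monomers available per binding site."""
    return molecules / sites

def units_per_site(molecules, sites, monomers_per_unit=MONOMERS_PER_UNIT):
    """Dimer-of-dimers units available per binding site."""
    return molecules / monomers_per_unit / sites

for label, sites in (("ChIP-Seq", SITES_CHIP_SEQ), ("predicted", SITES_PREDICTED)):
    print(label,
          round(monomers_per_site(MOLECULES_PER_CELL, sites), 1),
          round(units_per_site(MOLECULES_PER_CELL, sites), 1))
```

Even under the larger site count, roughly 100 monomers (about 24 dimer-of-dimers units) remain available per site, leaving ample protein for the cooperative oligomerization described above.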
The results of subcellular fractionation of Mtb
H37Rv cells from mid-log phase indicate that EspR is predominantly a cytosolic protein although it can be found attached to cell membrane-bound DNA, a trait of the nucleoid. Prior treatment of this fraction with DNase I releases most of the EspR to the cytosol (). In contrast to the findings of a previous report 
, we were unable to detect a secreted form of EspR in the culture filtrates of either the ESX-1-proficient and -deficient strains of H37Rv or the Erdman strains at early time points. Since EspR is a relatively abundant protein and is only found in the culture filtrate together with the cytosolic marker GroEL2, we conclude that it is released via cell lysis rather than by ESX-1-mediated secretion.
EspR seemingly acts as an activator or a repressor depending on its binding position relative to the genes it controls. EspR production appears to be autoregulated as the protein binds to its own promoter region and downregulates expression in certain conditions (). Interaction at the espR
regulatory site occurs in a contrasting manner to that seen at the espACD
locus where there are three prominent binding sites situated far upstream of the promoter (). Interestingly, some tubercle bacilli, including M. bovis
and M. microti
, have incurred the RD8 deletion in this region of the genome 
that completely removes the three major EspR-binding sites. Two EspR-binding sites flank the espR
promoter thus evoking an autoregulatory mechanism whereby EspR forms a loop at the promoter that either occludes RNA polymerase (RNAP) or traps RNAP that has already bound. In ChIP-Seq experiments performed with RNAP-antibodies, a major polymerase binding site was localized (data not shown) that partially overlaps EspR peak a
(), consistent with the promoter prediction from 5′ RACE.
Loss of EspR strongly attenuates Mtb
, suggesting that this is due to reduced functioning of the ESX-1 system as a result of insufficient EspA levels. However, in light of the present findings, this appears to be an oversimplification, as expression of the genes for several other known virulence determinants is clearly subject to EspR regulation. Foremost among these is a major locus that encodes an enzyme system required for synthesis of PDIM, and in some strains PGL, both of which contribute extensively to virulence
. Another enzyme that has an important role in pathogenesis is the lipase, LipF 
, which has been implicated in modification of the mycobacterial cell wall as an adaptive response to acid damage 
. LipF is also thought to degrade host lipids during infection 
. The EspR binding site () located far upstream of the lipF
coding sequence overlaps the previously identified 59 bp acid-inducible promoter region, situated 515 bp from the start codon 
. Occlusion of this site by EspR would therefore explain the observed repression of lipF
transcription (). The lipF
gene, together with a number of other EspR gene targets like fadD26
, is also regulated by PhoP 
and CRP 
, frequently with opposite effects on transcription.
Regulation of transcription orchestrated by EspR seems to occur at two levels. EspR binding at promoter regions, as in the case of espR
, resembles the action of global transcriptional regulators, where repression of transcription stems from occlusion of RNAP whereas activation of transcription occurs via favorable interaction with RNAP and/or other proteins. On the other hand, our genome-wide analysis revealed that more than half of the EspR-binding sites are intragenic and this refutes, at least partially, the hypothesis that EspR acts as a transcription factor per se
. Moreover, EspR overexpression had little effect on some major EspR-bound genes, suggesting that EspR binding does not necessarily affect transcription locally but that the bound sites instead serve as anchoring points to organize chromosome domains. NAPs with DNA-bridging activity, such as EspR, are often located at the boundaries of chromosomal domain loops
where they control gene expression in a temporal or spatial manner. In many bacteria, NAP expression levels are dependent on the growth phase 
. This is true of Mtb
since low EspR levels were detected at early- and mid-log phase compared to stationary phase, and premature conditional overexpression causes growth to slow down. The interplay between different NAPs alters chromosome structure and organization thereby influencing patterns of gene expression in a temporal manner.
Lsr2 is a DNA-bridging protein that also performs NAP functions in Mtb
by recognizing AT-rich and xenogeneic regions. Binding sites of Lsr2 in Mtb
have been mapped by Gordon et al.
using ChIP-on-chip technology 
. Comparison of the Lsr2 ChIP-on-chip and EspR ChIP-Seq results showed that 77% of the genes in the EspR regulon are also likely recognized by Lsr2 
owing to an extensive overlap between their repertoires (Fig. S2
). For example, all three genes significantly upregulated upon EspR overexpression also bind Lsr2, and major ChIP-Seq peaks of EspR are located close to Lsr2 binding sites in the ChIP-on-chip enriched regions (Fig. S2).
While Lsr2 and EspR are both subject to autoregulation there is no evidence for cross-regulation and Lsr2 seems to impact many more regions. Lsr2 has an N-terminal dimerization domain and a C-terminal DNA-binding domain whereas in EspR the opposite configuration exists. A further difference lies in the DNA recognition mechanisms since Lsr2 interacts with the minor groove while EspR binding is predicted to occur via the major groove. Together, this suggests that EspR and Lsr2 may control gene expression, including that of many cell wall functions, in a divergent manner with EspR possibly replacing Lsr2 at certain sites and vice-versa. It is striking that in all sequenced mycobacterial genomes espR
is very close to the hns
gene encoding another NAP 
and this may also indicate functional interplay. These hypotheses can be tested experimentally by using the corresponding antibodies to perform ChIP-Seq experiments on Mtb
strains at different stages in the growth cycle to localize their binding sites. The contribution of other global regulators that intersect with the EspR regulon, like PhoP and CRP, should also be examined. A regulatory scheme is emerging in which growth of Mtb
, and hence pathogenesis, is controlled by chromosome remodeling, effected by different NAPs, thereby resulting in pleiotropic regulation of gene expression. EspR may thus play a central role in regulating virulence gene expression analogous to that of H-NS in enteropathogenic bacteria.