Ground-truth dataset: ChIP-Seq assays were performed for the TFs ATF3, C/EBPδ, IRF1, NFκB/p50 and NFκB/p65 in macrophages activated through treatment with purified Toll-like receptor agonists for 1–6 h (see Supplementary Table S2 and Section S1.4). Binding locations were identified as locations where the ChIP-Seq signal exceeded a threshold, as described in Supplementary Section S1.7.
Prediction features: TF binding predictions were made in 100 bp intervals (as used in Won et al., 2010) of transcript-proximal regions comprising ~7% of the genome, selected as described in Supplementary Section S1.2. Combinations of eight features, individually listed in Supplementary Table S1 and labeled by index f, were used for TF binding prediction. Feature f = 1, which conferred TF specificity to the predictions, was based on motif scanning. For each TF, motif position-weight matrices (PWMs) corresponding to the TF were obtained from TRANSFAC (Supplementary Table S2 and Section S1.3). Sequences were scanned for motif matches using a likelihood-based algorithm (Lähdesmäki et al., 2008), and the matches were combined to obtain, within each interval and for each TF, a score representing the strength of the best match for any motif corresponding to that TF, at any position within the interval. Features 2–5 of Supplementary Table S1 were derived from HAc ChIP-Seq assays of unstimulated macrophages or macrophages stimulated for 1, 4 or 6 h with LPS (Supplementary Sections S1.4–1.5). VS for HAc local minima were computed as described in Supplementary Section S1.6. Features 6–8 were based on genomic sequence, and thus are not macrophage specific. For the stimulated-cell HAc ChIP-Seq features (Supplementary Table S1, rows 2 and 4), the time point of the HAc dataset used was always the same as the time point of the ground-truth dataset for the TF for which predictions were being made.
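The reduction of motif matches to one score per interval can be illustrated with a minimal sketch. It assumes that matches for all PWMs of one TF have already been pooled into (position, score) pairs and that a match is assigned to an interval by its start position. Both are assumptions made here for illustration; the likelihood-based scanning itself is not reproduced, and the function name and the example values are hypothetical.

```python
def best_match_per_interval(hits, interval_bp=100):
    """Keep the strongest motif match in each fixed-width interval.

    hits: iterable of (position_bp, score) pairs for one TF, pooled over
    all PWMs assigned to that TF (hypothetical input format).
    Returns a dict mapping interval index -> best score in that interval.
    """
    best = {}
    for position, score in hits:
        idx = position // interval_bp  # assumption: assign by start position
        if idx not in best or score > best[idx]:
            best[idx] = score
    return best


# Hypothetical matches: two fall in interval 0, two in interval 1, one in interval 2.
hits = [(12, 3.1), (87, 5.4), (130, 2.0), (199, 4.2), (250, 1.5)]
interval_scores = best_match_per_interval(hits)
```

Intervals without any match are simply absent from the result; how such intervals are scored in the original analysis is not stated in this section.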
Prediction model: within each interval i, the model integrates a set F of up to three features (always including the motif feature, f = 1) by a weighted sum of thresholded feature values. Feature values may depend on the TF t, as is the case for motif scanning, or on the cellular condition for which TF binding predictions are being made (as is the case for HAc-derived features). The value for feature f at interval i and TF t is therefore denoted by v_fit. The feature value v_fit is passed through a piecewise-linear function θ_f that is defined by feature-specific thresholds λ_f and μ_f,
[equation not recovered from the source]
The prediction score σ_it that the TF t binds within interval i is obtained by a weighted sum of thresholded contributions, but with a multiplicative factor enforcing a minimum TF-specific motif match value for a non-zero σ_it,
[equation not recovered from the source]
where the weight vector has unit L1 norm (a negative component would represent a feature that is anti-correlated with TF binding), and where θ is defined by θ(x) = 0 if x ≤ 0 and θ(x) = 1 if x > 0. Importantly, a given model instance, defined by its parameter tuple, is TF independent.
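Because the two displayed equations are not reproduced in this text, the following is only a sketch of one form consistent with the verbal description. It assumes that θ_f rises linearly from 0 at λ_f to 1 at μ_f, and that the multiplicative factor is the step function θ applied to the motif value minus its lower threshold. Both forms, the function names, and all numerical values are assumptions, not the published definitions.

```python
def ramp(v, lam, mu):
    """Assumed piecewise-linear threshold: 0 at or below lam, 1 at or above mu,
    linear in between."""
    if v <= lam:
        return 0.0
    if v >= mu:
        return 1.0
    return (v - lam) / (mu - lam)


def step(x):
    """Step function from the text: 0 if x <= 0, 1 if x > 0."""
    return 1.0 if x > 0 else 0.0


def prediction_score(values, weights, lams, mus, motif=1):
    """Assumed score: motif gate times the weighted sum of thresholded features.

    All arguments except `motif` are dicts keyed by feature index f.
    """
    gate = step(values[motif] - lams[motif])  # zero score without a minimal motif match
    total = sum(weights[f] * ramp(values[f], lams[f], mus[f]) for f in weights)
    return gate * total


# Hypothetical three-feature model; the weights have unit L1 norm.
weights = {1: 0.5, 2: 0.3, 6: 0.2}
lams = {1: 4.0, 2: 0.0, 6: 5.0}
mus = {1: 8.0, 2: 1.0, 6: 9.0}

score = prediction_score({1: 6.0, 2: 0.5, 6: 10.0}, weights, lams, mus)
gated = prediction_score({1: 3.0, 2: 0.5, 6: 10.0}, weights, lams, mus)
```

In the second call the motif value lies below its lower threshold, so the gate forces the score to zero regardless of the other features, which mirrors the role the text assigns to the multiplicative factor.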
Performance metric: for a given model, TF t, and prediction score cutoff σ, the intervals for which σ_it ≥ σ, forming the set Π(σ, t), were predicted to contain binding sites for t (remaining intervals were predicted to have no t binding). The set of intervals containing ground-truth binding sites (based on ChIP-Seq) is denoted by Σ(t). Because the typical ChIP-Seq fragment size was ~160 bp, some TF binding locations appeared as adjacent intervals in Σ(t); these were counted as single binding sites. The number of ground-truth binding sites B(t) was counted (Supplementary Table S2), and the fraction of these binding sites that coincided with at least one interval i ∈ Π(σ, t) was computed as the sensitivity S(σ, t). The FPR E(σ, t) was computed by dividing the number of intervals in the set difference Π(σ, t)\Σ(t) by the number of intervals not contained in Σ(t). The cutoff σ was varied and the resulting (E(σ, t), S(σ, t)) function [receiver operating characteristic (ROC) curve] was numerically integrated over the range 0 < E ≤ 0.01 to obtain the TF-specific performance score A(t). For model training (Supplementary Section S1.12), the cost function used was C(t) = 1 − A(t)/0.01. During training, cases where it was not possible to obtain a sufficient number of (S, E) samples were handled using a penalty, as described in Supplementary Section S1.11.
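The metric can be sketched as follows, under stated assumptions: intervals are identified by consecutive integer indices, the ROC samples are integrated with the trapezoidal rule, and the curve is linearly interpolated at E = 0.01. The integration scheme, the function names, and the example data are illustrative choices, and the penalty for undersampled curves is not modeled.

```python
def merge_sites(truth):
    """Group adjacent ground-truth interval indices into single binding sites."""
    sites, run = [], []
    for i in sorted(truth):
        if run and i == run[-1] + 1:
            run.append(i)
        else:
            if run:
                sites.append(run)
            run = [i]
    if run:
        sites.append(run)
    return sites


def roc_point(scores, truth, cutoff):
    """Return (E, S) at one cutoff.

    scores: dict interval index -> prediction score, covering all intervals.
    truth: set of interval indices containing ground-truth binding.
    """
    predicted = {i for i, s in scores.items() if s >= cutoff}
    sites = merge_sites(truth)
    hit = sum(1 for site in sites if any(i in predicted for i in site))
    negatives = [i for i in scores if i not in truth]
    false_pos = sum(1 for i in negatives if i in predicted)
    return false_pos / len(negatives), hit / len(sites)


def partial_auc(points, e_max=0.01):
    """Trapezoidal area under (E, S) samples, truncated at E = e_max."""
    pts = sorted(points)
    area = 0.0
    for (e0, s0), (e1, s1) in zip(pts, pts[1:]):
        if e0 >= e_max:
            break
        if e1 > e_max:  # interpolate the segment that crosses e_max
            s1 = s0 + (s1 - s0) * (e_max - e0) / (e1 - e0)
            e1 = e_max
        area += 0.5 * (s0 + s1) * (e1 - e0)
    return area


# Hypothetical data: intervals 0 and 1 form one merged site, interval 4 another.
scores = {0: 0.9, 1: 0.8, 2: 0.1, 3: 0.7, 4: 0.2, 5: 0.05}
point = roc_point(scores, {0, 1, 4}, cutoff=0.5)

# Hypothetical ROC samples and the resulting training cost C(t) = 1 - A(t)/0.01.
area = partial_auc([(0.0, 0.0), (0.005, 0.4), (0.02, 0.7)])
cost = 1.0 - area / 0.01
```

Dividing A(t) by 0.01 normalizes the area by its maximum attainable value over the integration range, so the cost lies between 0 (perfect sensitivity at every sampled FPR) and 1.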
Model training: groups of four TFs at a time were selected for model training, and for a given model, the cost was averaged over the four TFs, C = ⟨C(t)⟩_t. Model parameters were varied to minimize C subject to constraints on these parameters, using a two-stage optimization process (Supplementary Section S1.12), to obtain the best parameter set for the model with features F.
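Written out, the averaged training cost combines the per-TF cost with the mean over the training group. In the block below, the symbol T for the group of four training TFs is introduced here for convenience, and the A(t) values in the worked line are hypothetical, chosen only to show the arithmetic.

```latex
% T: the set of four training TFs (label introduced here for illustration)
C \;=\; \langle C(t) \rangle_t
  \;=\; \frac{1}{|T|} \sum_{t \in T} \left( 1 - \frac{A(t)}{0.01} \right),
\qquad |T| = 4 .

% Worked example with hypothetical values A(t) = 0.002, 0.004, 0.003, 0.005:
C \;=\; 1 - \frac{1}{0.01} \cdot \frac{0.002 + 0.004 + 0.003 + 0.005}{4}
  \;=\; 1 - \frac{0.0035}{0.01}
  \;=\; 0.65 .
```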
Model testing: for both training and testing purposes, the performance A(t′) of the model with the best parameter set from the training was measured on the fifth TF t′ using leave-one-out cross-validation. The five values for A(t′) were compared between different feature groups F using a paired t-test, and summarized in terms of the mean and SD (Supplementary Table S3).
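The leave-one-out scheme and the paired comparison can be sketched as below. Only the fold construction and the paired t statistic are shown; training and scoring are omitted, and the A(t′) values are hypothetical placeholders rather than results from the study. Converting the statistic to a P-value would additionally require the t distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

TFS = ["ATF3", "C/EBPδ", "IRF1", "NFκB/p50", "NFκB/p65"]


def leave_one_out_folds(tfs):
    """Yield (training TFs, held-out TF) pairs, one per TF."""
    for held_out in tfs:
        yield [t for t in tfs if t != held_out], held_out


def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length sequences of matched scores."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


folds = list(leave_one_out_folds(TFS))

# Hypothetical held-out performance values A(t') for two feature groups,
# listed in the same TF order so that the values are paired.
a_group_1 = [0.004, 0.005, 0.003, 0.006, 0.005]
a_group_2 = [0.003, 0.003, 0.002, 0.004, 0.004]
t_stat = paired_t_statistic(a_group_1, a_group_2)
summary = (mean(a_group_1), stdev(a_group_1))  # mean and sample SD
```

Pairing by held-out TF removes the between-TF variation in A(t′) from the comparison, which matters here because only five values are available per feature group.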