This PBM technology allows rapid, high-throughput characterization of the DNA binding site sequence specificities of TFs in a single day and can associate TFs with the genes they regulate. In addition to identifying enriched functional categories of known and newly discovered target genes, we also identified many uncharacterized ORFs as candidate target genes of Rap1, Abf1, and Mig1. As seen for Mig1, PBM experiments will be particularly useful when ChIP-chip does not yield enough enrichment of bound fragments in the immunoprecipitated sample to permit identification of the DNA sites bound in vivo. ChIP-chip experiments require that the cells be in culture conditions in which the TF of interest is expressed and nuclear. Furthermore, the antibody used in ChIP-chip may not be able to detect certain classes of TF DNA binding in vivo, for example if the primary epitope(s) become inaccessible due to the formation of particular complexes at certain sites. Moreover, integrating an epitope tag on the genomic copy of the TF, which allowed the use of a single antibody in the 106 ChIP-chip experiments performed by Lee et al.6, is not as trivial in many other organisms as it is in yeast; instead, antibodies that are both specific for the protein of interest and effective in chromatin immunoprecipitation are required, and generating such antibodies is itself a nontrivial undertaking.
Even though the DNA in PBM experiments is not in the same state as it might be if it were to be bound by the TF in vivo, results from PBM experiments can provide valuable data on the sequence specificity of TFs, particularly those that have been poorly understood or uncharacterized thus far. Performing ChIP-chip experiments on yeast grown under a variety of different culture conditions will help to confirm our predictions that particular sets of newly identified binding sites are indeed bound in vivo34. Furthermore, the combination of PBM data with mRNA expression data, ChIP-chip data, protein-protein interaction data, and prior genetic and biochemical data in the literature will contribute towards more detailed models of gene regulatory networks in yeast35.
It is possible that results from PBM and ChIP-chip experiments will not correspond so closely for all proteins. Such differences may help to identify whether there are significant in vivo effects due to chromatin structure or cofactors important in allowing or preventing sequence-specific binding. To look for evidence of such co-regulatory mechanisms, for each TF we searched the sets of intergenic regions bound only in vitro or only in vivo for secondary DNA sequence motifs. We did not find any secondary motifs that achieved statistical significance, potentially because of the many different modes by which binding of TFs to DNA is regulated in vivo. However, it is possible that such secondary motifs might exist for TFs not studied here.
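The partitioning step that precedes such a motif search can be sketched as follows. This is only an illustration under stated assumptions: the function name and the region identifiers are hypothetical, and the thresholds used to call a region "bound" in either assay are taken as already applied.

```python
def partition_bound_regions(pbm_bound, chip_bound):
    """Split intergenic region IDs by the assay(s) in which they were called bound.

    pbm_bound:  iterable of region IDs called bound in vitro (PBM)
    chip_bound: iterable of region IDs called bound in vivo (ChIP-chip)
    Both inputs are assumed to be thresholded already; this helper is hypothetical.
    """
    pbm_bound = set(pbm_bound)
    chip_bound = set(chip_bound)
    return {
        "in_vitro_only": pbm_bound - chip_bound,  # candidates for in vivo occlusion
        "in_vivo_only": chip_bound - pbm_bound,   # candidates for cofactor-assisted binding
        "both": pbm_bound & chip_bound,
    }


# Made-up identifiers, for illustration only
pbm_hits = ["region_A", "region_B", "region_C"]
chip_hits = ["region_B", "region_C", "region_D"]
groups = partition_bound_regions(pbm_hits, chip_hits)
```

The "in_vitro_only" and "in_vivo_only" sets would then each be passed to a motif-finding tool of choice to look for secondary motifs.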
The data presented here indicate that the PBM approach works for TFs with DNA binding domains of a number of different structural classes. PBMs could also be used to study DNA binding proteins important in other biological processes, such as DNA replication, DNA repair, genome rearrangements, or modification of DNA. Since PBM experiments are highly scalable, they could be adapted for the analysis of all possible DNA sequence variants. Similarly, there are hundreds of predicted DNA binding proteins in yeast and thousands of predicted TFs in other genomes that could be screened for sequence-specific binding by PBM experiments. Since dozens of PBM experiments could be performed in parallel in a single day, this technology provides significant cost and time advantages over other methods, which can take months to measure the effects of mutations for a large set of variant DNA-protein interactions.
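To give a sense of the size of the sequence space involved, the sketch below enumerates all DNA sequence variants of a given length. It assumes, as an illustration rather than as a description of any particular array design, that the DNA is double-stranded, so that a sequence and its reverse complement represent the same site; the function names are hypothetical.

```python
from itertools import product

# Translation table for complementing a DNA string
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k):
    """Enumerate all k-mers, keeping one representative per reverse-complement pair.

    The representative is the lexicographically smaller of the k-mer and its
    reverse complement (an arbitrary but common convention).
    """
    seen = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)


n_variants = len(canonical_kmers(8))
```

For even k the count is (4^k + 4^(k/2)) / 2, because palindromic sequences pair with themselves; for k = 8 this gives 32,896 distinct variants.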
The effects of different concentrations of TFs, protein cofactors, protein modifications, small molecule cofactors such as metabolites, or various binding conditions could be measured with PBMs. Indeed, it has been shown that sequence-specific in vitro binding by heterodimeric TFs can be detected with a PBM approach36. Similarly, PBMs could be used to distinguish the relative binding preferences of various whole or partially fractionated cell lysates, such as lysates from various cell types, sampled at different time points, or grown under different conditions.
Bioinformatic analysis of PBM data will provide more information than a mononucleotide PWM, as it has been shown previously that nucleotides of TF binding sites frequently do not act independently in binding by TFs37-39. Moreover, the vast datasets that would be generated on DNA-protein interactions by PBMs could yield the data required to determine what predictive rules may exist that describe DNA recognition by sequence-specific TFs40.
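The limitation of a mononucleotide PWM can be illustrated with a toy example. The sketch below is not the analysis used in this work: the sites are made up, and the pseudocount and uniform background are assumptions. It builds a log-odds PWM from a set of sites in which the two positions are perfectly correlated, then shows that the PWM, which scores each position independently, cannot distinguish observed sites from unobserved combinations.

```python
import math
from collections import Counter


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds mononucleotide PWM from equal-length aligned sites.

    The pseudocount and uniform background frequency are illustrative assumptions.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)  # missing bases count as 0
        column = {
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in "ACGT"
        }
        pwm.append(column)
    return pwm


def score(pwm, seq):
    """Sum per-position log-odds; this additivity is the independence assumption."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Made-up sites: position 1 is C whenever position 0 is A, and T whenever it is G
sites = ["AC", "AC", "GT", "GT"]
pwm = build_pwm(sites)

observed = score(pwm, "AC")    # seen in the training sites
unobserved = score(pwm, "AT")  # never seen, yet scored identically
```

Because the PWM stores only per-position frequencies, "AC" and "AT" receive the same score even though "AT" never occurs among the sites. Measurements over many sequence variants, such as those a PBM could provide, are what would reveal this kind of dependence.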
Finally, only a small handful of sequence-specific TFs have been characterized well enough to know many of the sequences that the TFs can and, just as importantly, cannot bind. Ultimately, more complete TF binding site data will permit more accurate prediction of functional cis regulatory elements within the vast stretches of noncoding sequence in the genomes of both model organisms and humans than has been possible thus far41.