Roughly speaking, ChIP-seq has three key steps that determine its success. The first and most crucial is antibody selection; the second is the actual sequencing, which is subject to several possible biases; and the third is the algorithmic analysis, including mapping and peak-calling.
The first requirement, obviously, is that the antibody has some specificity for the protein under study: this can be tested using a panel of recombinant proteins or cell lines transfected with different protein targets. Then, the antibody must be able to immunoprecipitate the target protein. Not all antibodies immunoprecipitate, and even those that do may not perform well in ChIP. Ideally, earlier studies will have identified genomic sites where the protein is known to bind, and these sites can be used to optimize the ChIP conditions.
The second issue is sequencing, which is a 'black box' for many biologists, who are familiar with what goes in and what comes out, but perhaps not with the possible biases introduced in between. Next-generation sequencing approaches require bulk processing of DNA fragments and massively parallel sequencing. This means that even a slight bias in linker ligation, in PCR amplification, or in hybridization can produce platform-dependent biases in the population data that emerge from 10 million or more reads. The technologies are still evolving, and the different formats have different biases. For this reason, it is important in a ChIP-seq experiment to run a control using 'input DNA' (non-ChIP genomic DNA) so that sequencing biases can be identified and adjusted for.
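One simple way to picture how an input control is used is as a depth-normalized ratio of ChIP to input tag counts in fixed genomic bins. The sketch below is only an illustration under stated assumptions: the function name `fold_enrichment`, the pseudocount, and the toy counts are all hypothetical, and real peak callers use more elaborate statistical models.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ChIP/input ratio after scaling the input to the ChIP depth.

    Both arguments are lists of tag counts over the same genomic bins.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    if chip_total == 0 or input_total == 0:
        raise ValueError("both libraries need at least one mapped tag")
    # Scale the input library to the ChIP library's depth so the ratio
    # reflects local enrichment rather than overall sequencing depth.
    scale = chip_total / input_total
    return [
        (c + pseudocount) / (i * scale + pseudocount)
        for c, i in zip(chip_counts, input_counts)
    ]


# Toy data: bin 2 is high in ChIP only; bin 4 is high in both libraries,
# as might happen in a region that is over-represented for technical reasons.
chip = [10, 12, 80, 11, 40]
inp = [20, 24, 20, 22, 80]
ratios = fold_enrichment(chip, inp)
for bin_index, ratio in enumerate(ratios):
    print(bin_index, round(ratio, 2))
```

In this toy example, bin 4 has the second-highest raw ChIP count, yet its ratio falls below 1 because the input is equally elevated there. Only bin 2 stands out once the control is taken into account.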
The third issue is mapping, which with short tags (around 25 to 35 bp) can be ambiguous in regions of high homology or in repeat regions. Longer tags reduce this ambiguity, but base-calling and sequencing errors then limit mappability. It is not uncommon for only 50% of the reads to be mappable, although more 'intelligent' mapping algorithms that take sequencing errors or polymorphisms into account have increased mappability significantly. In ChIP-seq, the density of mapped sequence tags is a prime determinant of success. Illumina's ELAND algorithm and MAQ (Mapping and Assembly with Quality) used to be the short-read mappers of choice, but a new generation of more efficient programs, such as Bowtie, BWA (Burrows-Wheeler Alignment Tool) and BFAST (BLAT-like Fast Accurate Search Tool), is gradually superseding them.
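The effect of tag length on mapping ambiguity can be illustrated with a toy calculation: for each start position in a sequence, ask whether the tag beginning there occurs exactly once. This is a minimal sketch under simplifying assumptions (exact matches only, forward strand only, a made-up 43-bp sequence containing one 8-bp repeat); the helper name `unique_fraction` is hypothetical and does not correspond to any of the mappers named above.

```python
from collections import Counter


def unique_fraction(genome, tag_length):
    """Fraction of start positions whose tag occurs exactly once in genome.

    Simplifications: exact matching, forward strand only, no sequencing errors.
    """
    n_positions = len(genome) - tag_length + 1
    if tag_length <= 0 or n_positions <= 0:
        raise ValueError("tag_length must be between 1 and len(genome)")
    tags = [genome[i:i + tag_length] for i in range(n_positions)]
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / n_positions


# Toy sequence: two copies of an 8-bp repeat separated by unique flanks.
repeat = "ACGTTGCA"
toy_genome = "GATTACAGG" + repeat + "CCTAGGTCA" + repeat + "TTGACCGAT"

for length in (4, 8, 12):
    print(length, round(unique_fraction(toy_genome, length), 3))
```

Tags no longer than the repeat can fall entirely inside it and therefore map to both copies, whereas tags longer than the repeat always reach into a unique flank. The sketch ignores sequencing errors, which in real data work in the opposite direction as tags get longer.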