Microarrays are expensive. Depending on the platform, a single experiment can cost hundreds, if not thousands, of dollars, and there is currently no consensus on how many replicates are needed, although the number, fortunately, appears to be low [46, 47]. The cost of the arrays is separate from the cost of the reagents: the Cy3 and Cy5 fluorescent dyes coupled to nucleotide triphosphates are expensive, as are the kits for biotin-based, single-dye labeling, and the biotin labeling protocols are tedious and time-consuming. The array scanner and workstation (though a one-time purchase, and often shared via a core facility), the effort of a technician and an analyst, and of course the intangible cost of the time spent doing the experiment in the first place must all be added to the cost of microarray experimentation.
This accounting does not take into consideration an important and often overlooked point in microarray experiments: how many times an array can be stripped and reused. Combimatrix and Nimblegen offer protocols and advice to those wanting to reuse their arrays, but as far as most companies and investigators are concerned, arrays are single-use reagents. Many home-brewed protocols describe stripping procedures for other arrays, but not everyone agrees that these techniques yield an array as unbiased as one that has never been used.
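The effect of reuse on cost can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the function name, its parameters, and every dollar figure are hypothetical, the one-time scanner purchase is left out, and no allowance is made for any loss of quality after stripping.

```python
def cost_per_hybridization(array_price, reagent_cost, labor_cost, uses_per_array=1):
    """Cost of one hybridization, spreading the array price over its uses.

    All inputs are hypothetical placeholders in the same currency unit.
    Reagents and labor are assumed to be paid in full on every use.
    """
    if uses_per_array < 1:
        raise ValueError("uses_per_array must be at least 1")
    return array_price / uses_per_array + reagent_cost + labor_cost


# Invented figures, for illustration only.
single_use = cost_per_hybridization(500, 150, 100)                 # 750.0
reused_once = cost_per_hybridization(500, 150, 100, uses_per_array=2)  # 500.0
print(single_use, reused_once)
```

Under these invented figures, reusing each array once cuts the per-hybridization cost by a third, which is why the question of whether a stripped array remains unbiased matters.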
Probe length is still a controversial subject, with some claiming that shorter probes are more specific [48] and others preferring longer probes because of their sensitivity. It is clear, at least, that probe length should be tailored to the application of the array [49, 50]. Depending on the manufacturer and the manufacturing protocol, longer probes may or may not cost more than shorter ones, and the manufacturing technique ultimately determines the maximum possible probe length, as most processes suffer from decreasing accuracy and low yield at the limits of probe length.
Good sequence information is critical to the success of a microarray: each probe needs to be as complementary as possible to its target sequence. Some deviation from perfect complementarity can be tolerated, and the degree of tolerance depends on hybridization and washing conditions. In fact, some groups have tried to quantify acceptable deviation from perfect complementarity [51, 52]. It is certainly possible to detect single-base changes; SNP chips, built just for this purpose, have been quite successful. Recently, an array-based method for mapping sequence polymorphisms across the whole yeast genome has been outlined [53]. While it is clear that perfect complementarity is important, nobody has established guidelines for planning and analyzing real-life experiments, which will necessarily include imperfect hybridization (due to both polymorphism and error).
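As a rough illustration of what deviation from perfect complementarity means at the sequence level, the sketch below counts the positions at which a probe fails to pair with its target. It assumes an ungapped, full-length alignment with both sequences written 5' to 3'; the function names and example sequences are hypothetical. It says nothing about hybridization thermodynamics, which is what actually determines how many mismatches are tolerated under given hybridization and washing conditions.

```python
# Watson-Crick pairing table for DNA (assumes only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(probe, target):
    """Count positions where the probe does not pair with the target.

    Both sequences are given 5'->3' and must have the same length;
    gaps and partial overlaps are not handled in this sketch.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    perfect_probe = reverse_complement(target)
    return sum(p != q for p, q in zip(probe.upper(), perfect_probe))


target = "ATGCGTAC"                            # made-up target region
print(count_mismatches("GTACGCAT", target))    # perfectly complementary: 0
print(count_mismatches("GTACGGAT", target))    # one substituted base: 1
```

A count like this could serve as a first filter when sequence polymorphism or sequencing error is suspected, but the position and identity of each mismatch also affect binding and are ignored here.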
Still debated in the world of microarrays is how to handle intra- and inter-platform variability, and whether experiments can be compared across platforms. At the very least, the amount of data generated creates a formidable informatics challenge [54]. While spotting and in situ synthesis techniques and probe design continue to improve, there appear to be many significant sources of variability within the technology, ranging from the specific batch of manufactured arrays and reagents [47], to feature shapes and binding affinities [55], to the particular settings of the scanner, and, perhaps most importantly, to the operator or technician processing the samples, even within the same laboratory (with the same protocols and reagents). The laser detection systems themselves are a further source of variability.
There have been a few encouraging studies examining variability among microarrays run on different platforms and in different labs, but questions remain, as the two largest such investigations [56, 57] apparently used the same technicians in multiple labs to perform the same experiments on multiple arrays. This overly stringent design leaves a major question open: in reality, array experiments are performed by many different people at many levels of training and, like any other experimental technique, are subject to the expected variations in protocol that occur when weather, social or family schedules, and lack of sleep interfere.
Therefore, each array must be normalized, and the arrays in a set must be analyzed together. A number of normalization strategies and numerous analytical techniques are in use today; excellent synopses of the statistical techniques needed for microarray analysis can be found elsewhere [58-61].
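To make the idea of normalization concrete, the sketch below applies global median centering to log2 intensities, one of the simplest strategies. It is offered only as an illustration and is not necessarily the approach the cited reviews recommend. The function name, the dictionary layout, and the intensity values are all hypothetical, and intensities are assumed to be positive and already background-corrected.

```python
import math
from statistics import median


def median_center(arrays):
    """Median-center the log2 intensities of each array.

    `arrays` maps an array name to a list of positive raw intensities.
    After centering, every array has a median log2 intensity of 0, so a
    uniform brightness difference between arrays is removed.
    """
    normalized = {}
    for name, intensities in arrays.items():
        logs = [math.log2(x) for x in intensities]
        center = median(logs)
        normalized[name] = [value - center for value in logs]
    return normalized


# Made-up data: array_b is uniformly twice as bright as array_a,
# as might happen with a different scanner setting.
raw = {
    "array_a": [100, 200, 400],
    "array_b": [200, 400, 800],
}
for name, values in median_center(raw).items():
    print(name, [round(v, 3) for v in values])
```

In this toy example both arrays end up with the same centered values, because the only difference between them was a constant scaling. Real data usually also show intensity-dependent and spatial effects that a single global shift cannot remove, which is one reason more elaborate strategies exist.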
Finally, as more microarray data are published, more studies of data and experimental design integrity are being performed, some of which challenge the validity and reproducibility of microarray-based clinical research. Recent articles describe common mistakes and basic flaws in the experimental design of studies published in peer-reviewed journals, flaws that call into question the reliability of their outcomes [62].
Rising to the challenge, biostatisticians are developing sophisticated analysis techniques to control artifacts and bias while extracting as much information as possible from array data. Making sense of the results from a biological or medical perspective is equally challenging: the quantity of data necessitates automated analysis, which often introduces yet another layer of communication (though one rewarded by a new perspective) when a programmer is added to the team.