Scaling ISH approaches to gene set or exome-wide analyses requires pipelines for probe synthesis, hybridization, analysis and annotation. Whereas much groundwork has been performed in the framework of large mouse transcriptome studies on frozen tissues, modifications are necessary for successful and reproducible ISH in FFPE materials. For facile probe synthesis, we conclude that careful informatics combined with the use of a pooled human cDNA library (MegaMan) gives a high first-pass success rate in probe template generation. This cDNA library encompasses mRNA from 32 human tissues and 34 human cancer cell lines, which ensures a good representation of transcripts with diverse expression levels and splice variants. In parallel projects, this approach has successfully been used to generate a total of 700 probes with a first-pass success rate of >95% (data not shown). Three factors essential for successful staining differed from previously published procedures employing frozen tissues: the use of freshly sectioned tissue, a higher concentration of proteinase K, and two cycles of tyramide biotin-based signal amplification. Freshly sectioned tissue was needed because weak or no staining was observed for older FFPE tissue sections, possibly due to oxidation of RNA on the exposed surface or unsatisfactory fixation conditions affecting RNA quality. Although the two amplification cycles may result in slightly higher background levels, they are in our experience a convenient way to increase signal intensity.
The specificity and sensitivity of ISH were thoroughly evaluated by staining consecutive sections from the same TMAs, which encompass ~40 normal tissue types in the human body. The ISH and IHC staining patterns of KRT17 were identical across tissue types (a selection of representative samples is shown in and Figure S3), thus providing strong validation of the ISH approach. This high degree of concordance is in agreement with previous observations from mouse transcriptome projects. However, a much-needed improvement to enable efficient data mining of ISH and IHC data is a standardized ontology for annotation of human tissues, analogous to the EMAP mouse anatomy ontology.
Antibody sensitivity and specificity are a major concern in immunohistochemistry, especially in large-scale projects where numerous antibodies are being produced towards targets with unknown expression patterns. Specificity validation of antibodies to diagnostic grade may be performed by demonstrating a highly similar staining pattern with an antibody raised to a different epitope of the same antigen, loss of signal in knock-out mice or other model organisms, or a highly similar staining pattern with in situ hybridization. We here demonstrate the feasibility of the latter approach at organism scale in the human, as TMAs encompassing virtually all normal tissues can be analyzed in a single experiment. We were able to verify the specificity of novel antibodies to BRD1 with two independent RNA probe pairs, confirming that ISH can provide independent specificity validation. A selection of representative samples is seen in and Figure S4. In parallel experiments, we were unable to confirm the specificity of the novel antibodies to FAM174B. The partial correlation between mRNA and protein expression for GAD1 may be due either to limited antibody specificity or to differences in transcriptional and translational processes. The lack of correlation between mRNA and protein expression for LYN may have biological or technical causes. The antibodies for LYN target all known isoforms of the proteins, and the RNA probe pairs are located in regions that are present in all known transcripts. For PTPRC, two independent RNA probe pairs were also used, both of which showed a lack of ISH staining across inter-TMA replicates. We therefore believe that the lack of correlation between mRNA and protein expression is most likely due to biological reasons, such as post-transcriptional, translational and post-translational modifications affecting the levels of mRNA and protein. mRNA decay and protein half-life could also affect the correlation between mRNA and protein. Another source of discrepancy between IHC and ISH could be lack of sensitivity or specificity in either of the approaches; this enables interpretation of concordant staining patterns as supportive of antibody specificity and of discordant patterns as indicating lack of specificity. In large-scale efforts, antibodies with concordant ISH staining patterns should be prioritized for further analyses over antibodies with discordant or absent ISH staining.
SATB2, a nuclear matrix-associated transcription factor and a member of the family of special AT-rich binding proteins, has recently been shown to be expressed in normal cells of the lower gastrointestinal tract and in cancer cells of colorectal origin. Due to the highly specific nuclear expression pattern in normal and malignant cells of the gastrointestinal tract, SATB2 protein expression was suggested as a clinically useful diagnostic biomarker for colorectal cancers, the third most commonly diagnosed cancer in the world. Cohen's Kappa test was performed to determine the agreement between IHC and ISH staining, demonstrating good inter-rater reliability. The scalable technology presented here also enables gene set expression analyses in human tissue arrays. We envision that the tyrosine kinome, the tyrosine phosphatome and other genes in cancer pathways, along with a multitude of novel candidate cancer genes derived from exome and genome sequencing, constitute prime targets for large-scale in situ hybridization-based characterization. We propose that comprehensive and systematic mapping of expression in situ in normal and malignant human tissues at cell-type resolution will provide valuable knowledge, especially in cancer drug development, as the results can aid in predicting effects, guide repositioning of available targeted drugs, and help predict response to drugs that inhibit several different molecular targets. It also provides the technical foundation for transcript expression mapping of human protein-encoding genes, and potentially also other RNA species, in systematic whole-genome approaches.
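As an illustration of the Cohen's Kappa agreement test mentioned above, the following is a minimal sketch of how agreement between IHC and ISH scores on the same tissue cores could be computed. The function name, the two-level scoring ("pos"/"neg") and the example scores are assumptions made for illustration only; they are not the scoring scheme or the data of this study. Kappa is calculated as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the marginal frequencies of each method.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical scores."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of samples scored identically by both methods.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical per-core scores, invented for illustration (not study data).
ihc = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
ish = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohens_kappa(ihc, ish)  # approximately 0.6 for these toy scores
```

In this toy example the two methods agree on 8 of 10 cores while chance agreement is 0.5, giving a kappa of about 0.6. A graded scoring scheme (for example negative, weak, moderate, strong) could be passed to the same function unchanged, although a weighted kappa may be more appropriate for ordered categories.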