In this study, a characteristic transcript set specific for the colorectal dysplasia-carcinoma transition was determined using whole genomic microarrays in 53 biopsy samples. To test the differentiation power of the discriminatory gene panel, an additional 94 microarrays from independent colonic biopsy specimens, together with microarray datasets downloaded from the Gene Expression Omnibus, were also analyzed. Further validation was performed with array real-time PCR cards containing the characteristic transcript panel. The identified set of 11 transcripts can be used to separate CRC, adenoma and normal biopsy samples; moreover, it is suitable for discriminating between high-grade dysplastic adenoma and early stage CRC cases with high specificity and sensitivity.
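To make the terms sensitivity and specificity concrete, the following minimal Python sketch computes both for a two-class comparison such as early stage CRC versus high-grade dysplastic adenoma. The function name, the choice of CRC as the positive class, and the sample labels are hypothetical and serve only as illustration; they are not data or code from this study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="CRC"):
    """Return (sensitivity, specificity), treating `positive` as the disease class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample called positive
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly called positive
            else:
                tn += 1  # negative sample called negative
    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for illustration only (not results from this study).
truth = ["CRC", "CRC", "CRC", "adenoma", "adenoma", "adenoma", "adenoma"]
calls = ["CRC", "CRC", "adenoma", "adenoma", "adenoma", "CRC", "adenoma"]
sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Here sensitivity is the fraction of true CRC samples that the classifier calls CRC, and specificity is the fraction of non-CRC samples that it correctly calls non-CRC.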
Whole genomic microarray analysis is an important tool for high-throughput gene expression screening, but its equipment and reagent costs prevent it from being a cost-effective diagnostic tool. Quantitative array real-time PCR cards with assays for a selected set of classifiers therefore offer a more viable alternative for diagnostic application, with lower costs and the possibility of automating the whole process from RNA isolation to RT-PCR analysis.
The current method of diagnosing colorectal cancers and adenomas is histological analysis. Colon biopsy specimens are evaluated on 4–5 small sections, each 3–5 µm thick, taken from different areas of the colon. However, critical areas, including aberrant crypt foci in hyperplastic polyps, in situ carcinoma in adenomas, and dysplastic areas and carcinomas in long-standing IBD specimens, may remain hidden in the uncut specimen block or be missed because of inadequate orientation. In this study, whole biopsy specimens containing mixed cell populations were used for mRNA expression microarray and real-time PCR analysis in order to overcome the potential sampling errors of conventional histological analysis. Although histological laser microdissection can provide accurate cell type specific information, its major limitation is the need for a highly skilled operator, which makes it a poor candidate for a diagnostic tool.
In addition, pathologists now face a growing workload owing to the increasing demand for cancer screening biopsies, molecular testing for targeted therapy, and the concomitant sub-specialization. An alternative but still reliable method for identifying diseased or negative specimens could therefore be of great importance. The automated evaluation of colon biopsy specimens by mRNA expression profiling could be a valid approach, since much of the methodology, preparation and analysis procedure is already available.
Furthermore, mRNA expression analysis gives insight into altered cellular functions beyond the microscopic level. This information might be related to the biological behaviour of tumors and/or the expression of therapeutic targets, e.g. growth factor receptors. The expression of metastasis-related genes and of genes involved in tumor invasiveness may also be identified.
The set of 11 classifiers determined in our study showed considerably high discriminatory power on the microarray data files of previous studies in CRC vs. normal and in adenoma vs. normal comparisons. These in silico results suggest that the identified transcript panel can be used as a set of general discriminative markers for colorectal cancer and polyps. Among the datasets in the Gene Expression Omnibus (GEO) database that used the Affymetrix HGU133 Plus 2.0 microarray system, only those containing either CRC and normal or adenoma and normal biopsy samples could be downloaded. To our knowledge, this study is the first whole genomic oligonucleotide microarray study available in GEO that contains CRC, adenoma and normal biopsy samples together, and it is therefore suitable for identifying discriminatory transcripts even between early stage CRC and high-grade dysplastic adenoma tissues. Common pre-processing of the data files from different studies resulted in a clear separation not only of diseased and normal samples, but also of adenoma and CRC samples. However, datasets from different studies are difficult to handle together, as differences in sample preparation can distort the results; this can lead to overestimation of the efficacy of adenoma and CRC discrimination.
Of the 11 discriminatory transcripts, all except COL12A1 (namely IL8, MMP3, IL1B, CHI3L1, GREM1, IL1RN, CXCL1, CXCL2, CA7 and SLC7A5) are thought to be associated with colorectal carcinogenesis and progression. In accordance with our findings, seven of them (IL8, CHI3L1, CXCL1, CXCL2, MMP3, SLC7A5 and CA7) were found to be differentially expressed in CRC compared to normal tissue in previous microarray studies. CA7 was also found to be downregulated not only in carcinoma but also in adenoma samples.
Interleukin 8 (IL8) promotes cell proliferation and migration of human colon carcinoma cells through metalloproteinase-mediated cleavage of proHB-EGF. The expression of the SLC7A5 cationic amino acid transporter was also found to be significantly associated with cell proliferation and angiogenesis; moreover, it seems to play an important role in enhancing tumor growth in vivo. The secreted interleukin-like Gro-alpha oncogene (CXCL1) and matrix metalloproteinase 3 (MMP3) promote tumor initiation and growth (21–22), while chitinase 3-like 1 (CHI3L1) can protect cancer and/or stromal cells against apoptosis. Elevated expression of interleukin 1 beta (IL1B) mRNA increases the risk of non-small cell lung cancer. Although IL1B polymorphisms are known to be associated with tumor recurrence in stage II colon cancers, the function of this gene in CRC has not been clarified. Gremlin 1 (GREM1), an antagonist of bone morphogenetic proteins, has been shown to regulate early development and tumorigenesis. It is overexpressed in various human tumors and plays an oncogenic role, especially in carcinomas, including CRC. In previous studies, a highly significant upregulation of the CXCL2 chemokine was found in CRC compared to normal colonic mucosa; this upregulation was already detectable in benign adenoma, suggesting the involvement of CXCL2 in the dysplasia-carcinoma transition.
In summary, this study identified a set of 11 discriminatory transcripts that could correctly classify not only normal, adenoma and CRC biopsies, but also high-grade dysplastic adenoma and early stage CRC samples, even when tested on a large independent sample set. Although 10 of the 11 discriminatory genes are already known to be associated with CRC, this study is the first to apply these markers as a combined discriminative set. The identified set of 11 markers proved to be a highly specific and sensitive discriminator of the colorectal dysplasia-carcinoma transition, which is of great clinical importance for the early diagnosis of CRC. These markers can form the basis of gene expression based diagnostic classification of benign and malignant colorectal diseases and of the development of diagnostic real-time PCR cards; furthermore, they may be used for prospective biopsy screening at both the mRNA and protein levels.