Microarray technology has helped to accumulate an immense pool of data on gene expression changes in response to different environmental factors. Yet computer-generated gene profiling, although limited by the availability of EST collections in databases, represents a valuable alternative to microarrays: it allows efficient discovery of homologous sequences in evolutionarily distant species and comparison of gene sets on the whole-genome scale. Importantly, computer profiling is especially sensitive to low-abundance transcripts that are normally underrepresented in cellular mRNA, such as those encoding transcription factors. Furthermore, the method does not require extensive statistical support because of the large number of available transcripts (tens of thousands of ESTs) and the reliable selection of upregulated genes in the expression profile. Previously, we successfully used computer-assisted gene profiling to identify testis-specific genes in Drosophila
]. Thus, supplementing microarray data with EST-based gene profiling, and comparing the information collected by the two approaches, would be beneficial. Indeed, our quantitative comparison of gene profiling data using the Shannon index demonstrated that EST mapping and microarrays can efficiently complement each other.
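As a rough illustration of how a Shannon index can summarize a gene profile, the sketch below computes H = -sum(p_i ln p_i) over the share of genes in each functional category. The category names and counts are hypothetical and are not taken from this study, and the exact way the index was applied to the EST and microarray data is not described in this section, so this is only one plausible reading.

```python
import math


def shannon_index(counts):
    """Return H = -sum(p_i * ln(p_i)) over the non-zero category counts."""
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one count must be positive")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical numbers of genes per functional category in two profiles.
est_profile = {
    "defense": 40,
    "abiotic stress": 46,
    "photosynthesis": 60,
    "transcription factors": 54,
}
microarray_profile = {
    "defense": 70,
    "abiotic stress": 30,
    "photosynthesis": 20,
    "transcription factors": 10,
}

h_est = shannon_index(est_profile.values())
h_array = shannon_index(microarray_profile.values())
print(f"EST profile H = {h_est:.3f}, microarray profile H = {h_array:.3f}")
```

A higher H means the genes are spread more evenly across categories; comparing H between two profiles gives a single-number view of how differently each method samples the functional classes.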
Applying computer-assisted gene profiling to the study of gene expression induced by plant pathogens was the main goal of this study. In general, our results are in good agreement with those generated by genome chip technology. Overall, the two independent approaches (microarray and EST profiling) reveal the same classes of functionally related genes whose expression levels are elevated as a result of the defense response. These include many groups of genes involved in general plant metabolism (photosynthesis, protein synthesis, mitochondrial genes) along with genes of host stress and defense responses [6
]. In the EST-derived profile, genes annotated as defense-related and responsive to abiotic stress represented 20% and 23%, respectively, of all the stress-responsive genes in the Arabidopsis genome. Other classes found included putative membrane-associated receptors and many transcription factors with unknown regulatory functions.
According to our estimate, the contribution of different groups of functionally related genes to plant defense responses varied significantly. The share of genes associated with photosynthesis, which under the same conditions is a rather constitutive and stable process, was especially high (Table ). This may indicate a role for these genes in basal defense responses rather than a rapid switch toward viral needs [9
]. Other researchers also found considerable variation in the degree of involvement of different groups of genes in host responses [6
In addition, the computer-generated profile of gene expression changes during pathogen attack uncovered activation of many genes normally induced in response to abiotic factors. Cold stress-responsive genes showed the greatest engagement and the highest expression levels. Although it is hard to suggest a specific role for them in defense reactions, plant defense mechanisms can be temperature-dependent. For instance, at low temperatures RNA silencing-mediated antiviral defense is inhibited [10
The experimental system of compatible virus-host interaction between A. thaliana and CMV(Y), which was used in this work to evaluate the reliability of computer-generated profiling, is based on the fact that the plant lacks powerful means of genetic resistance, such as R genes, and therefore will not develop a hypersensitive response or systemic acquired resistance to the pathogen. The expression changes observed in genes pre-selected according to the created profiles demonstrated the high stringency of EST profiling even when applied to a random model of host-pathogen interaction. Activation of transcription factors, genes of general metabolism, and stress genes may correspond to the initial, basal response to the pathogen, which is eventually overwhelmed by the infection.
Profiles of defense-related genes obtained by mapping heterologous ESTs represent putative Arabidopsis homologs of genes from the corresponding species. Comparing these profiles in pairs and locating the genes they share allowed indirect estimation of the similarity between the defense-related gene sets of different plant species, based on their Arabidopsis homologs. This similarity was high among the three dicotyledonous species and rather low between the monocotyledonous T. aestivum and the three dicots. This suggests that the repertoire of genes participating in defense reactions, although similar to some degree in dicots and monocots, has nevertheless diverged considerably between them, while apparently remaining generally conserved within dicots. Since Arabidopsis and Glycine max are in the Rosids and both Lycopersicon esculentum and Solanum tuberosum are in the Asterids, it appears that evolutionary conservation of defense responses in these two groups may be traced to their common ancestor as far back as 150 million years ago [11
]. Differences in defense mechanisms between dicots and monocots could in theory date back to their split ~200 million years ago [11
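The pairwise comparison of profiles described in this paragraph can be sketched as a set-overlap calculation on Arabidopsis homolog identifiers. The Jaccard index used here is an assumed stand-in, since the similarity measure is not specified in this section, and the locus identifiers and species assignments below are hypothetical placeholders rather than data from this study.

```python
from itertools import combinations


def jaccard(a, b):
    """Return |A & B| / |A | B| for two collections of gene identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(profiles):
    """Map each unordered pair of species to the Jaccard index of their gene sets."""
    return {
        (s1, s2): jaccard(profiles[s1], profiles[s2])
        for s1, s2 in combinations(sorted(profiles), 2)
    }


# Hypothetical Arabidopsis homologs hit by defense-related ESTs of each species.
profiles = {
    "Glycine max": {"AT1G01010", "AT1G02020", "AT2G03030", "AT3G04040"},
    "Solanum tuberosum": {"AT1G01010", "AT1G02020", "AT2G03030", "AT5G06060"},
    "Triticum aestivum": {"AT1G01010", "AT4G05050", "AT5G07070", "AT5G08080"},
}

for (s1, s2), score in pairwise_similarity(profiles).items():
    print(f"{s1} vs {s2}: {score:.2f}")
```

Because every profile is expressed as Arabidopsis homologs, species whose own genomes are not directly comparable can still be compared through this shared reference set.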
In summary, we have demonstrated that computer-assisted gene profiling based on heterologous EST mapping can be used efficiently to reveal genes involved in host defense responses to pathogens. Unlike microarrays, it permitted indirect comparison of complete sets of functionally related genes in different species on the whole-genome scale. The method allows effective identification of tissue-specific and organ-specific genes, genes expressed at a particular developmental stage, and genes responsive to internal and external stimuli. Moreover, EST profiling makes it possible to narrow these groups down further to individual biological processes, cellular compartments, or molecular functions. For instance, in this work, EST mining identified a large group of defense-related genes encoding calcium-binding proteins. Since calcium is an important second messenger that plays an essential role in plant defense responses [13
], these data may eventually be applied to identify a role of Ca2+-binding proteins in specific plant-pathogen interactions.
For quantitative representation of expression levels in a profile, only ESTs derived from non-normalized cDNA libraries can be used, even though genes expressed at low levels are underrepresented in such libraries. In normalized libraries, by contrast, such genes are better represented, but their ESTs cannot be used for quantitative analysis of expression level.
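To make the quantitative use of non-normalized libraries concrete, the sketch below converts raw EST-to-gene assignments into frequencies per 10,000 ESTs and compares two libraries. The scaling factor, the pseudocount, the function names, and the gene identifiers are all illustrative assumptions and do not reproduce the procedure used in this study.

```python
from collections import Counter


def est_frequencies(gene_hits, per=10_000):
    """Convert a list of gene IDs (one entry per mapped EST) to ESTs per `per` ESTs."""
    counts = Counter(gene_hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * per / total for gene, n in counts.items()}


def fold_change(treated, control, pseudo=1.0):
    """Ratio of treated to control frequency; the pseudocount avoids division by zero."""
    genes = set(treated) | set(control)
    return {
        g: (treated.get(g, 0.0) + pseudo) / (control.get(g, 0.0) + pseudo)
        for g in genes
    }


# Hypothetical EST-to-gene assignments from two non-normalized libraries.
infected_hits = ["AT1G01010"] * 6 + ["AT2G03030"] * 3 + ["AT3G04040"]
control_hits = ["AT1G01010"] * 2 + ["AT2G03030"] * 3 + ["AT3G04040"] * 5

infected = est_frequencies(infected_hits)
control = est_frequencies(control_hits)
for gene, ratio in sorted(fold_change(infected, control).items()):
    print(f"{gene}: {ratio:.2f}")
```

This only works when EST counts reflect transcript abundance, which is why ESTs from normalized libraries, where abundant transcripts are deliberately depleted, are unsuitable for this kind of calculation.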
Presently, EST mining can easily be adapted to reveal specific genes in the Arabidopsis, Drosophila, and mouse genomes and, to a lesser degree, in the human genome. Databases such as GenBank are constantly supplied with new ESTs and annotated genome sequences. Soon, vast datasets will be available from deep RNA sequencing on next-generation platforms. Because of this, we believe that the approach described in this study may be successfully applied to further increase our knowledge and understanding of transcriptome dynamics in general and, more specifically, its role in host-pathogen interactions and plant defense mechanisms.