In this study we define a subset of AP2/ERF proteins known as CRF or Cytokinin Response Factor proteins, named after the six members of this group from Arabidopsis (CRF1-6). CRF proteins are characterized by the presence of two domains: a specific variant of the AP2/ERF DNA binding domain near the middle of the protein and a novel domain at the N-terminal end that is unique to CRF proteins. Additionally, many CRF proteins also carry a putative kinase phosphorylation motif in the C-terminal half of the sequence.
The novel CRF domain is present in all CRF proteins, always found in the N-terminal region and always paired with a distinct AP2 DNA binding domain sequence. CRF proteins can be identified using either the AP2 or the CRF domain alone. Previous phylogenetic studies of ERF proteins examining just the AP2 domain do return a cluster of proteins that possess the CRF domain. Interestingly, while AP2 domains without a CRF domain are found in a wide range of plants and even some bacteria, the CRF domain is never found in any protein without an AP2 domain [3]. Proteins that contain the CRF domain make up about 10% of Arabidopsis ERF proteins (12/122), 6.5% of rice ERF proteins (9/137), and 6.5% of Populus ERF proteins (11/168) [2]. Estimates of CRF domain abundance in other plant species are difficult to make without a large scale study of ERF proteins or a sequenced genome, but the numbers appear to fall in a similar range of 5-10% of ERF proteins.
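The proportions quoted above follow directly from the reported counts. A minimal sketch of that arithmetic, in which the dictionary layout and the function name crf_share are illustrative choices rather than anything used in the study:

```python
# Counts as reported in the text: (CRF-domain proteins, total ERF proteins).
counts = {
    "Arabidopsis": (12, 122),
    "rice": (9, 137),
    "Populus": (11, 168),
}


def crf_share(crf, total):
    """Return the percentage of ERF proteins that carry a CRF domain."""
    return 100.0 * crf / total


# Percentage per species, computed from the counts above.
shares = {species: crf_share(*pair) for species, pair in counts.items()}

for species, pct in shares.items():
    print(f"{species}: {pct:.1f}%")
```

All three values fall within the 5-10% range given in the text.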
The CRF proteins appear to be unique to land plants. Despite the availability of several fully sequenced green algal genomes, CRF domain containing genes could not be identified in this lineage. In searching for them, we assumed that moss and other bryophyte CRF domain sequences would make the best BLAST queries, although even those may not have been similar enough to any algal CRF domains to return a hit. AP2 domain containing genes are present in green algal genomes, but examination of several of these [15] did not reveal any ERF type genes in these organisms.
Flowering plant CRF proteins fall roughly into two large clades, A and B, according to distance and parsimony analyses (Figure ). That diverse species of flowering plants, i.e. both monocots and eudicots, occur in each of these clades suggests a relatively early divergence of the two CRF lineages.
For numerous plant species, clade A contains approximately twice as many CRF genes as clade B. Previous larger scale analyses of the entire ERF protein family, which included then unknown CRF proteins, hinted at such a discrepancy for some species but lacked resolution of clade A and B members [2]. Our analysis of a wide range of species strongly suggests that B clade CRFs, while slightly divergent from A clade members, clearly belong to the CRF group and not to any other ERF sub-group.
An additional motif found in a number of CRF proteins is the amino acid sequence SP[T/V]SVL in the C-terminal end of the protein, downstream of the AP2 domain. Although this motif is found in all of the previously described Arabidopsis CRF genes, similar genes from rice (subgroup VI), and both Tomato Pti6 and Tobacco Tsi1, it is not ubiquitous among CRF proteins, occurring in roughly half of the CRFs identified here for which full length sequence is available [8]. Additionally, in contrast to the CRF domain, the SP[T/V]SVL motif is known to occur in a number of proteins outside of the AP2/ERF family, including Phosphatidylinositol transferases, Universal stress proteins, LRR/extensions, and Myb and Zinc Finger transcription factors. There are 33 non-CRF proteins that fall into this category in Arabidopsis alone. Interestingly, several members of this group have also been shown to affect leaf development like the CRF genes, including Early Phytochrome Responsive1 (EPR1), Longifolia (LNG1 and LNG2) and Growth Regulating Factor3 (GRF3) (Additional File 2). While the effects on leaf development vary among this group, it is revealing that nearly 30% of Arabidopsis proteins with this motif can be linked to leaf development. Perhaps more significant is the apparent link between this motif and regulation by the hormone cytokinin: nearly half of the non-CRF genes that contain this motif and have been examined in Affymetrix microarray experiments (14 of 30) show altered transcript levels upon treatment with cytokinin or in cytokinin mutants (Additional File 2). It is possible that the SP[T/V]SVL motif functions as a MAP kinase and/or casein kinase 1 phosphorylation site, as part of this motif has previously been described as such [8]. Such a function may serve to link CRFs to cytokinin regulation, as phosphorylation is essential to other members of the cytokinin signaling pathway [16].
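The SP[T/V]SVL motif described above maps naturally onto a regular expression, which gives one simple way to screen protein sequences for it. The following is a minimal sketch only: the function name find_motif and the toy sequences are hypothetical illustrations, not real CRF proteins and not the search procedure used in this study.

```python
import re

# SP[T/V]SVL: Ser-Pro, then Thr or Val, then Ser-Val-Leu (one-letter codes).
MOTIF = re.compile(r"SP[TV]SVL")


def find_motif(seq):
    """Return the 0-based start position of every SP[T/V]SVL match in seq."""
    return [m.start() for m in MOTIF.finditer(seq.upper())]


# Hypothetical toy sequences for illustration only.
examples = {
    "toy_with_T": "MAAKSPTSVLRG",
    "toy_with_V": "MSPVSVLKK",
    "toy_without": "MAAKSPASVLRG",
}

# Map each toy sequence name to the list of motif start positions found.
hits = {name: find_motif(seq) for name, seq in examples.items()}
```

A position-based result makes it straightforward to check that a match lies downstream of the AP2 domain, as the motif does in CRF proteins.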
The exact function of CRF proteins is currently unclear. Only eight have been experimentally examined prior to this report: CRF1-6 from Arabidopsis, PTI6 from Tomato, and TSI1 from Tobacco. While all CRFs possess an AP2/ERF binding domain and are most closely related to Ethylene Response Factor (ERF) proteins directly involved in ethylene response, there is no evidence that CRFs are linked to ethylene save the putative ability of their AP2 domain to bind the ethylene response element, GCCGCC. None of the Arabidopsis CRF genes show any transcriptional change in response to ethylene or in ethylene mutant backgrounds in microarray experiments. Additionally, CRF1-6 mutants bear little resemblance to classic ethylene mutant phenotypes and show no sign of variation in ethylene levels, even in analysis of triple mutant knockout lines [[17], Rashotte and Kieber, unpublished result].
One potential role of CRF genes could be in cytokinin regulation, as all six Arabidopsis CRF proteins examined have been shown to be regulated by cytokinin in terms of their intracellular, particularly nuclear, localization, the nucleus presumably being the site of action for these transcription factors [10]. However, only three of these six CRF genes, CRF2, CRF5, and CRF6, are regulated by cytokinin transcriptionally; the other three do not show any transcriptional regulation by cytokinin in microarray experiments [10]. Study at the protein level has yet to determine if the other Arabidopsis CRF proteins are similarly cytokinin regulated, and none of the newly discovered CRF proteins identified here in other plant species have been examined in this manner to date.
Only one study prior to this one has examined cytokinin regulation at the transcriptional level outside of Arabidopsis: Hirose et al., 2007 [20]. This study in rice used microarrays to examine global expression patterns of genes regulated by cytokinin application and in a plant overexpressing a cytokinin response regulator. While this work did identify several highly related ERF family genes that are induced by cytokinin similarly to the CRFs, and supports a general role of ERFs in cytokinin regulated processes, the induced ERF genes in rice do not contain either a CRF domain or an SP[T/V]SVL motif in their protein sequences [20]. These cytokinin induced ERF genes are related to CRFs but lack CRF domains, and after careful examination have been placed by sequence analyses into different subgroups (group B-3 or VII vs. group B-5 or VI [2]). Further examination of the other members of this group and of the CRF domain alone is needed to determine if this domain is specifically involved in cytokinin regulation of these proteins. Despite the current lack of evidence, it is an attractive hypothesis that the CRF domain is somehow involved in cytokinin regulation.
Mutational analysis of CRF genes is likewise limited to studies in Arabidopsis, although an RNAi knockout of Tsi1 was generated in tobacco [10]. The study of mutants in the CRF1 to CRF6 genes has indicated a possible role in cotyledon, leaf, and embryo development in addition to their link to cytokinin [10]. There are no specific reports to date of additional phenotype alterations for any of the other Arabidopsis genes containing a CRF domain or for CRF genes in any other species, although this may be due to a lack of mutants, a lack of study, or the potentially redundant nature of CRF genes as seen in Arabidopsis.
To further examine whether cytokinin plays a role in the regulation of newly 'discovered' CRF proteins outside of Arabidopsis, we examined four CRF genes from Tomato identified in this study. We chose these Tomato genes because they allowed examination of both a known CRF protein, PTI6, which we also designate SlCRF1, and three novel proteins, SlCRF3-5, none of which had previously been examined for cytokinin or any other response. An analysis of SlCRF transcripts in Tomato leaves in the presence and absence of cytokinin treatment shows that all four SlCRFs are induced by cytokinin to varying degrees (Figure ). This result shows that SlCRFs are regulated by cytokinin, at least at the transcriptional level, and suggests that CRF proteins may generally play a role in cytokinin regulation. It is interesting that all the SlCRFs show transcriptional cytokinin regulation when not all Arabidopsis CRFs are known to be transcriptionally regulated, especially since none of the CRF domain containing genes in rice appear to show this regulation in the one microarray examination available [10].
Additional putative functions for CRF proteins, including links to pathogen response and salt stress, come from the two other CRF proteins that have previously been examined in some detail for function, Pti6 from Tomato and Tsi1 from Tobacco [12]. The Pti proteins Pti4/5/6 are known to interact with Pto kinase, which is directly linked to pathogen response, as this is how they were originally identified [12]. All three of these Pti proteins are also ERF proteins, yet only Pti6 can be classified as a CRF protein. While it is unclear how these Pti proteins act with Pto kinase in pathogen response, it appears that the CRF domain is not essential for that interaction. Not only do both Pti4 and Pti5 lack that domain, but yeast two-hybrid analyses examining which parts of the Pti proteins are required to interact with Pto kinase showed that the initial 48 amino acids of Pti6, containing at least half of its CRF domain, were not necessary [12]. Tsi1 has been linked by transcript induction to high salt stress and bacterial pathogen resistance in 35S overexpressing Tsi1 transgenic plants and to similar stresses in Tsi1:RNAi plants [13]. Interestingly, Park et al., 2001 also found that the AP2 domain of Tsi1 was able to bind both the GCC box involved in ethylene response and the CBF/DREB cis-element involved in drought stress response. This would suggest the involvement of the AP2 domain in the salt stress response of Tsi1. Together these analyses of Pti6 and Tsi1 clearly indicate a role for these proteins in pathogen response. This suggests that pathogen response may be a broader function of the CRF group of proteins, although there is little evidence that any CRF proteins other than Pti6 and Tsi1 are involved, given the lack of response of Arabidopsis CRF genes in several pathogen response microarray analyses. Response to salt stress could also be a part of CRF protein function, as Tsi1 has been shown to function in that area and some microarray data suggest that a few Arabidopsis CRF genes could be involved, but little is known beyond that. A more detailed analysis of pathogen and salt stress responses will have to be made before any clear function can be ascribed to this group of proteins or genes, but now that the members of this group are defined it should be easier to compare and compile these and other potential functions in the future.