Searching for
CcPAL1 paralogs in
C. canephora successfully led to the isolation of two new coffee
PAL genes,
CcPAL2 and
CcPAL3, which differ in their encoded proteins and in their location on the
C. canephora genetic map (Lefebvre-Pautigny et al.
2010). In addition, a complementary bioinformatic analysis aimed at establishing the number of
PAL genes expressed in the coffee plant was conducted to screen the publicly available
C. canephora EST databases (Lin et al.
2005;
http://solgenomics.net/). This study detected the same three genes, but found no other potential
PAL gene. The resulting mapping data obtained for the three genes will be integrated into a study aiming to identify the quantitative trait loci (QTL) affecting agronomically important traits in coffee beans. If these genes are found to co-localize with the identified QTL, they will likely be used as candidates for improving coffee plants through marker-assisted selection (MAS) (Srinivas et al.
2009; Mohan et al.
2010). The presence of these three expressed
PAL paralogs in
C. canephora is consistent with results already obtained for other plant species. Indeed,
PAL genes generally belong to a multigene family whose number of genes varies depending on the plant considered. For example,
A. thaliana contains four
PAL genes:
AtPAL1 to
AtPAL4 (Raes et al.
2003). However, some plant species, such as the potato and particularly the tomato, contain several dozen
PAL genes, many of which are inactive (Joos and Hahlbrock
1992; Chang et al.
2008). It would thus be interesting to determine whether such transcriptionally inactive
PAL genes could also exist in coffee, which would not be surprising, given that coffee and tomato share common gene repertoires (Lin et al.
2005). The forthcoming genome sequence of
C. canephora will certainly help answer that question. Nevertheless, even when degenerate primers were used to amplify PAL sequences, only the same three transcriptionally active
PAL genes described here and in a previous study (Mahesh et al.
2006b) were amplified.
A previous study suggested that
CcPAL1 may be involved in the differential accumulation of the main groups of CGA found in green coffee beans, the caffeoylquinic acids (Mahesh et al.
2006b). However, the quantitative gene expression results obtained in the present work using TaqMan q-RT-PCR technology, while consistent with the involvement of
CcPAL1 in CGA accumulation in coffee beans, also strongly suggest that
CcPAL2 and
CcPAL3 are likewise involved in this process. Indeed,
PAL gene expression analysis in various coffee tissues showed that two or three coffee
PAL genes are often expressed in a single tissue or organ, but not necessarily at the same stage of development. In the bean, for example,
CcPAL1 and
CcPAL3 were found to be highly expressed at the immature stage, while the
CcPAL2 expression was extremely low. Later in maturation, however (from the LG stage onward),
CcPAL1 and
CcPAL3 transcript levels had fallen significantly and the
CcPAL2 transcript levels had risen slightly. The high expression of
CcPAL1 and
CcPAL3 at the small green (SG) stage correlates with the high co-expression of
HQT (
hydroxycinnamoyl-
CoA quinate hydroxycinnamoyl transferase), a gene also involved in CGA biosynthesis (Niggeweg et al.
2004). This observation suggests that the concomitant high expression of these genes leads to a high production of CGA during the SG stage (Lepelley et al.
2007). In addition, the
CcPALs and
CcHQT transcript levels tend to fall as the quantity of CGA drops in the later stages of bean development, suggesting that the genes are co-regulated.
The fact that all three
CcPAL transcripts were detected in many tissues or organs, but at quite varying levels, suggests that the corresponding enzymes probably play particular roles in different parts of the plant while simultaneously playing smaller overlapping housekeeping roles. While
CcPAL1 and
CcPAL3 appear to be more strongly linked with a high accumulation of CGA in the coffee bean, it appears that
CcPAL2 may contribute more significantly to flavonoid accumulation. Evidence for this proposal comes from the substantial level of expression of
CcPAL2 in the flower, an organ known to have a relatively high rate of flavonoid synthesis. The fact that the
CcPAL1 transcript level was also high suggests that this gene could also contribute to the flavonoid precursor pool in this organ. Finally, it is also worth noting that the three coffee
PAL genes are expressed when the pericarp of the coffee cherry is red, suggesting that they may participate in fruit coloration. Searching for their potential co-expression with other genes, either from the general phenylpropanoid pathway (i.e.,
C4H and
4CL), or from the specific flavonoid (i.e.,
CHS and
CHI) or lignin (i.e.,
COMT and
CAD) branches, could help to determine more precisely whether a coffee
PAL could be more specifically involved in one of the two major phenylpropanoid branches. Such an approach was used by Gachon et al. (
2005), who showed that in
Arabidopsis the paralogs of the phenylpropanoid pathway displayed a clear differential co-expression according to the culture conditions applied to the plant and its response. In the same manner, Mahroug et al. (
2006) linked the specific co-expression observed for three genes (
PAL,
C4H and
CHS) with the high flavonoid content found in the upper epidermis of
C. roseus.
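As a rough illustration of the co-expression search proposed above, the sketch below computes pairwise Pearson correlations between expression profiles measured over the same set of samples. It is only one possible way to score co-expression and is not the method used in the cited studies. The function names and the profile values are hypothetical placeholders, not measurements from this work.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coexpression_table(profiles):
    """Return {(gene_a, gene_b): r} for every unordered pair of genes."""
    genes = sorted(profiles)
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }


# Hypothetical relative expression values across four samples
# (made-up numbers for illustration only, not data from this study).
profiles = {
    "PAL_a": [1.0, 2.0, 3.0, 4.0],
    "C4H": [2.0, 4.0, 6.0, 8.0],
    "CHS": [4.0, 3.0, 2.0, 1.0],
}
table = coexpression_table(profiles)
```

In such a table, values of r close to 1 across tissues or stages would point to candidate co-expressed pairs, which would still need to be confirmed experimentally.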
The three deduced coffee PAL protein sequences were compared to 98 PAL protein sequences from other species. The phylogenetic tree showed that CcPAL2 was the only one of the three proteins whose sequence clustered with other PAL sequences from the Asterids, the clade to which
C. canephora belongs.
CcPAL2 has been located within a coffee linkage group region that is syntenic to tomato (observation based on the results obtained by Lefebvre-Pautigny et al.
2010). This observation strongly suggests that
CcPAL2 and the corresponding tomato
PAL gene both derive from a common
PAL ancestor, as do, most probably, the other Asterid genes whose deduced protein sequences cluster together on the phylogenetic tree (Fig. ).
CcPAL1 and
CcPAL3 could be specific
Coffea paralogs produced from an ancient duplication of
CcPAL2, followed by a more recent duplication of the resulting paralog, since
CcPAL1 and
CcPAL3 seem to branch very closely on the phylogenetic tree. Both
CcPAL2 and
CcPAL3 have a phase 1 intron.
CcPAL2 being the ancestral form, it might be assumed that
CcPAL3 results from the first duplication. As
CcPAL1 carries a phase 2 intron, it may then be presumed to be the most recent paralog, resulting from a duplication of
CcPAL3. Interestingly, CcPAL1 and CcPAL3 proteins were found grouped with PtrPAL2, PtrPAL4 and PtrPAL5, three
P. trichocarpa proteins encoded by genes more specifically expressed in xylem and carrying, in their promoters, five core motifs similar to elements which are known to regulate phenylpropanoid gene expression (Shi et al.
2010a,
b). This result suggests that
CcPAL1 and
CcPAL3 may be co-expressed with genes of the phenylpropanoid pathway that lead to monolignol biosynthesis. As both paralogs appear to be products of duplication, lignification in woody plants could be considered a function derived from an ancestral function of the phenylpropanoid pathway, such as flavonoid biosynthesis, the first plant protection against UV light. This observation highlights that it would be informative to isolate the 5′ untranslated region (UTR) and the promoter region sequences of the three
CcPAL paralogs to acquire meaningful additional data on their specific functions and their transcriptional control, often exercised by
R2R3-
MYB transcription factors in monolignol and flavonoid synthesis (Stracke et al.
2007; Bomal et al.
2008; Luo et al.
2008).
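The intron phases referred to above can be derived from a gene model. The sketch below assumes the usual convention in which the phase of an intron is the number of coding bases that precede it, modulo 3: phase 0 falls between two codons, phase 1 after the first base of a codon, and phase 2 after the second. The function name and the exon lengths in the usage lines are hypothetical and do not correspond to the actual CcPAL gene structures.

```python
def intron_phases(exon_cds_lengths):
    """Return the phase of each intron in a gene model.

    exon_cds_lengths: coding lengths (in bases) of the exons, listed in
    5' to 3' order. UTR bases must be excluded. The phase of an intron is
    the number of coding bases upstream of it, modulo 3.
    """
    phases = []
    coding_bases = 0
    # There is one intron after every exon except the last one.
    for length in exon_cds_lengths[:-1]:
        if length <= 0:
            raise ValueError("exon coding lengths must be positive")
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical two-exon gene models (lengths chosen for illustration only).
phase_one_example = intron_phases([400, 1700])   # 400 % 3 == 1
phase_two_example = intron_phases([401, 1699])   # 401 % 3 == 2
```

Under this convention, a shift of a single coding base upstream of the splice site is enough to move an intron from phase 1 to phase 2.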
The present work, dedicated to the family of coffee
PAL genes, has led to the identification, characterization and mapping of three genes. Their differential expression, and particularly the association established between the high expression observed for
PAL1,
PAL3 and
HQT genes (Fig. ) and the high CGA accumulation in the immature coffee bean, is a preliminary step toward characterizing their functions. In further research, it would be interesting to assess whether all three coffee PAL proteins are biochemically active in vitro, by producing them in
Escherichia coli, and then testing their ability to catalyze the deamination of
L-phenylalanine to form
trans-cinnamic acid (Reichert et al.
2009). As a key step, studying coffee
PAL expression profiles and segregation in different
C. canephora varieties and in their offspring after crossing, and quantifying the related phenylpropanoid levels for association studies, will be particularly useful for advancing coffee breeding programs. The use of such genetic markers, derived from genes encoding proteins associated with flavonoid and CGA accumulation, could be of substantial interest for the selection of
C. canephora varieties with improved traits, e.g., varieties rich in antioxidant compounds beneficial to human health or with improved organoleptic quality.