Plants have evolved extremely diversified gene families as tools to cope with a harsh environment. Some of these families, such as the cytochromes P450 and the UDP-glycosyltransferases (UGT), reflect the extraordinary biochemical versatility found within and across plant species, and represent a very valuable source of genes for biotechnology. Both gene families offer huge potential for bioremediation and for the control of pesticide tolerance in crops and weeds [1], as well as for industrial applications. P450s, considered the most versatile catalysts known [4], usually activate dioxygen and transfer one of its atoms into various substrates, but they also catalyze a great diversity of other reactions, including C-C and C=N bond cleavage, phenolic coupling, dehydration, dehydrogenation, isomerization and reduction [5]. Many of these reactions are important for the biosynthesis of hormones, drugs, pigments, aromas, biopolymer building blocks and defense molecules [6]. Glycosyltransferases are also essential for the production of natural compounds, since they control the solubility, stability, transport, storage and sometimes also the bioactivity of these compounds [8]. Although some of this potential has become directly accessible through genome-wide sequencing, extensive information is restricted to model plants, usually with a small genome, or to plants of major economic interest. Exploiting this knowledge to target genes in other plants that need to be studied or engineered, or to explore gene families in plants with specific biosynthetic capacities, is an objective for the next several years.
With the growing availability of gene sequences and of information on their diversity and phylogeny, increasingly sophisticated PCR techniques have been developed to target gene families. Plant P450s are low-abundance, membrane-bound and unstable proteins that are usually difficult to purify. For this reason, several groups attempted early on to isolate P450 genes on the basis of the most conserved consensus regions, after generating probes by conventional PCR at low stringency [10]. This approach was later refined and used by several other groups to isolate P450 genes from various plant species [e.g. [14]]. It proved successful in many cases, although it led only to a small number of highly expressed and related P450 families. A significant step forward came from coupling degenerate PCR, using a primer targeting the heme-binding region, with differential display of the amplified fragments. This approach allowed effective identification of nine P450 genes responsive to elicitor treatment of soybean cell cultures [19] and of 21 unique P450 genes in Taxus cells induced for taxol production [20]. Such an approach, however, requires a carefully controlled and strongly differential system. Another interesting improvement, reported recently, involves the use of nested primers to increase PCR selectivity [21]. The major limitation of all the strategies reported so far is that they did not take into account the huge diversity and low conservation of P450s recently revealed by genome sequencing in higher plants. They therefore allowed neither focused gene selection nor isolation of the most divergent P450 clades, and thus no systematic exploration of the P450 superfamily in highly divergent species.
In this paper we report on the high potential of the recently described COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOP) strategy of primer design [22], which ensures optimal primer match and PCR amplification focused on very short conserved sequences. We show its value for the isolation of orthologues in evolutionarily distant species and for the focused or systematic exploration of gene families in plants with a very large genome. The method was tested by analyzing both the duplication and the conservation of the CYP98 family of P450 genes in many plant species. This family was recently suggested to play an essential role in lignification and plant development [23]. The same approach was also used to analyze the UGT85 family in wheat.
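
To illustrate why focusing degeneracy on a very short conserved sequence matters, the following minimal sketch compares the number of distinct oligonucleotides needed to encode a short core with the number needed to encode a longer conserved block. It is not the CODEHOP design program: the function name and the example peptides are hypothetical, the standard genetic code is assumed, and every codon is counted in full, which ignores the common practice of dropping the final wobble base.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(peptide: str) -> int:
    """Count the distinct DNA sequences that can encode `peptide`.

    Illustrative helper (not from the paper): the degeneracy of a fully
    degenerate primer is the product of the codon counts of its residues.
    """
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical peptides chosen only for illustration.
    short_core = "FGAG"          # 4 residues, as in a short degenerate core
    long_block = "PFGAGRRICP"    # 10 residues, fully degenerate primer

    print(short_core, backtranslation_degeneracy(short_core))
    print(long_block, backtranslation_degeneracy(long_block))
```

In this sketch, restricting degeneracy to the four-residue core keeps the primer pool at 128 sequences, whereas back-translating the whole ten-residue block would require several hundred thousand. A hybrid primer keeps the pool small by using a single consensus sequence for the remainder of the primer.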