Although the complete genome sequence and annotation of Arabidopsis were released at the end of 2000, understanding the function of each gene in the Arabidopsis genome remains a great challenge. One way to understand the function of genes on a genome-wide scale is expression profiling by microarrays. However, the expression level of many genes in the Arabidopsis genome cannot be detected by microarray experiments. In addition, many more novel genes have been discovered by experiments or predicted by new gene prediction programs. Another way to understand the function of individual genes is to investigate their in vivo expression patterns by reporter constructs in transgenic plants, which can provide basic information on the patterns of gene expression.
A high throughput pipeline was developed to generate promoter-reporter (GFP) transgenic lines for Arabidopsis genes expressed at very low levels and to examine their expression patterns in vivo. Promoter regions from a total of 627 non- or low-expressed Arabidopsis genes, selected on the basis of Arabidopsis annotation release 5, were amplified and cloned into a Gateway vector. A total of 353 promoter-reporter (GFP) constructs were successfully transferred into Agrobacterium (GV3101) by triparental mating and subsequently used for Arabidopsis transformation. Kanamycin-resistant transgenic lines were obtained from 266 constructs, and among them positive GFP expression was detected from 150 constructs. Of these 150 constructs, multiple transgenic lines exhibiting consistent expression patterns were obtained for 112 constructs. A total of 81 different regions of expression were discovered during our screening of positive transgenic plants and assigned Plant Ontology (PO) codes.
Many of the genes tested for which expression data were lacking previously are indeed expressed in Arabidopsis during the developmental stages screened. More importantly, our study provides plant researchers with another resource of gene expression information in Arabidopsis. The results of this study are captured in a MySQL database and can be searched at http://www.jcvi.org/arabidopsis/qpcr/index.shtml. Transgenic seeds and constructs are also available for the research community.
At the end of 2000, the first plant genome project, sequencing of the whole genome of Arabidopsis thaliana, was completed by a multinational collaborative effort . Subsequently, the plant scientific community set the goal of understanding the function of each gene in the Arabidopsis genome, which is encapsulated in the National Science Foundation (NSF) 2010 program, and this challenge has been taken up by many Arabidopsis researchers . Identification of each gene in the fully sequenced Arabidopsis genome and uncovering their function will provide crucial information for biologists to understand plant physiology, genetics, development and evolution. One way to understand the function of genes on a genome-wide scale is expression profiling by microarrays using formats that include complementary DNA , oligonucleotides [4,5] or amplicons . Recently, whole genome tiling arrays have been developed and used to interrogate the gene structure and expression status of the entire Arabidopsis genome [7-9]. In addition, microarrays have been used to analyze genome features such as chromatin structure [10,11]; sites of DNA modifications [12,13]; and DNA-protein binding sites [14,15]. So far, the expression data from ATH1 chips covers a wide range of experimental treatments and conditions in the public domain http://affymetrix.arabidopsis.info/narrays/experimentbrowse.pl; http://arabidopsis.org/info/expression/ATGenExpress.jsp that collectively make a great contribution to understanding gene function in Arabidopsis. 
Another way to understand the function of individual genes is to investigate their in vivo expression patterns spatially, temporally, or conditionally by reporter construct methods that include 1) random integration of promoter-less reporter constructs into the genome and monitoring of their expression to identify genes, promoters, and enhancers ; 2) transformation of promoter-reporter fusions from a particular gene back into the organism and observing the reporter expression patterns [17-22].
Many studies on gene expression in plants have shown that the majority of elements necessary and important for the regulation of expression lie immediately upstream of the transcriptional initiation site, usually within the first kilobase [17-21]. Although there are reports showing that gene regulatory signals may also be located in other regions (e.g. within introns or the 3' UTR) [23-25], these are in a minority. Thus the traditional view that the majority of a plant promoter's activity lies immediately upstream of the transcriptional initiation site is generally well supported, and there is considerable evidence that the in vivo expression of reporter genes driven by such regions (1-2 kb upstream) does reflect the expression pattern of the native gene. Such reporter expression can thus provide preliminary but critical information on endogenous gene expression patterns and evidence of their biological or developmental functions. A transgenic system such as this can also reveal, without a priori knowledge, cell-type-specific patterns of gene expression (e.g. in trichomes, hydathodes or stomata) that would escape detection by microarray approaches.
In our current study, we focused on two groups of genes in Arabidopsis, 1) unannotated genes identified by our previous experiments  that were predicted by EuGene  and Twinscan , programs that incorporate comparative genomics information. 2) Annotated Arabidopsis genes whose expression was detected in less than 5% of ~1,400 ATH1 Affymetrix GeneChip experiments downloaded from The Arabidopsis Information Resource (TAIR) website at the start of this work. For both groups of genes, our knowledge of their function is extremely limited. In our previous study that focused on hypothetical gene structure in Arabidopsis, localized expression patterns of five exemplar genes were detected by cloning the promoter regions into green fluorescent protein (GFP) reporter constructs and transforming them into Arabidopsis . For the present study, we developed a high throughput pipeline to create promoter-reporter (GFP) fusions of both gene groups, transformed them into Arabidopsis, and screened the expression of the reporter genes in positive transgenic plants at various developmental stages. Over 600 Arabidopsis genes were selected to make the promoter-reporter constructs and transgenic lines were obtained for just under half of these. The expression of promoter-reporter (GFP) constructs in transgenic plants was examined at 4 developmental stages: on the selection plate around 10 days after germination, at the rosette stage in soil, just before flowering, and at the flowering stage. GFP expression has been observed in constructs from 150 different promoters and is typically localized to a few tissues or cell types (e.g. hydathode, pedicel, socket cell, guard cell), consistent with the absence from or low abundance of transcripts from these genes in EST libraries. For the remaining target genes, no visible expression was detected, although PCR confirmed the presence of the transgene in all cases tested. 
All the expression patterns have been annotated according to Plant Ontology codes http://www.plantontology.org/ and stored in a MySQL database. To our knowledge, this is the first set of large-scale promoter-reporter expression data in Arabidopsis focusing on novel genes and genes with limited expression data, and thus should be a valuable resource for the plant research community. All our data are publicly available through the project website http://www.jcvi.org/arabidopsis/qpcr/index.shtml; transgenic seeds and constructs are also available to the research community through the Arabidopsis Biological Resource Center (ABRC).
The selection of target genes was based on the Arabidopsis annotation release 5 that was available when the project started and includes 1) previously identified un-annotated genes; 2) genes represented on the Affymetrix ATH1 GeneChip showing no expression; 3) genes represented on the Affymetrix ATH1 GeneChip showing very limited expression. Un-annotated genes are genes in the intergenic regions whose expression was detected by RACE in our previous experiments and those predicted in intergenic regions by EuGene  and/or Twinscan . Genes on the Affymetrix ATH1 GeneChip showing no or limited expression were identified from the Affymetrix MAS calls of 1381 ATH1 arrays downloaded from TAIR, combined with massively parallel signature sequencing (MPSS) data available at that time  (for details see MATERIAL AND METHODS). The resulting set of genes (identifiers and targeted promoter sequences) is presented in additional file 1: Table S1. Many of these were not annotated at the time of this work. However, all except 27 are now present in the TAIR9 annotation (June 2009) (additional file 2: Table S2). Five of these 27 still unannotated genes showed GFP expression in transgenic plants.
To handle the relatively large number of genes to be studied, we modified our original protocols [29,31] to produce a robust high throughput pipeline that included batch primer design, high throughput cloning and transformation, and a project-tracking LIMS implemented in a MySQL database (Figure 1). In this study, the Gateway® Gene Cloning strategy was used to make promoter-reporter constructs. PCR amplification of promoters and Gateway cloning were performed in 96-well plate format. Two colonies from each Gateway BP cloning reaction (one reaction per promoter) were picked and sequenced to confirm target identity. DNA isolated robotically from the sequence-confirmed clones was used in Gateway LR reactions. The triparental mating method was used to transfer promoter-reporter constructs from E. coli to Agrobacterium (GV3101). Bypassing the isolation of destination clone DNA from E. coli made this step economical and scalable. In addition, we cultured only 50 ml of Agrobacterium for plant transformation and did the floral dip in 50 ml Falcon tubes. As shown in Figure 2, a total of 627 candidate genes were put through the construction and transformation pipeline. Gateway entry clones were obtained from 469 genes and sequence confirmed. Of these, 442 were successfully transferred into the destination vector, pYXT2, which contains the GFP reporter gene. Subsequently, Agrobacterium clones from 353 genes were obtained by triparental mating and then transformed into Arabidopsis by the floral dip method. Three independent transformations (floral dips with separate plants) were performed for each construct, and 3 seedlings from each kanamycin selection plate (whenever possible) were transferred into soil, for a maximum possible set of 9 transformed plants per construct.
Positive transgenic plants from 266 constructs representing promoters from 266 genes were obtained and GFP expression patterns have been observed from 150 constructs in at least one of the 4 developmental stages examined. This newly developed cloning and transformation pipeline greatly improved the throughput and lowered the effort compared with our previous work.
Additionally, we have developed a project specific MySQL database to record and track the large amount of data generated by this project, including target genes, promoter selection, primer design, different steps of clone tracking, transgenic plant tracking, and GFP image capture. All the GFP images were annotated according to Plant Ontology (PO) codes http://www.plantontology.org/index.html. This MySQL database supports our publicly accessible website http://www.jcvi.org/arabidopsis/qpcr/index.shtml which users can search for GFP images by tissue type, locus name, or PO code.
In total, 1,885 positive transgenic lines from 266 promoter-reporter constructs were obtained. However, only 1,457 transgenic plants from 239 genes survived the process of transferring plants from selection plates to soil. Among them, 761 transgenic plants from 150 promoters showed GFP expression. Of these 150 promoters, consistent GFP expression patterns were observed in two or more transgenic plants for 112 promoter constructs. For 19 promoter constructs, only one transgenic line was recovered and showed GFP expression. For 14 promoter constructs, more than one transgenic line was obtained, but only one line expressed GFP. In addition, plants from 5 promoters had inconsistent GFP expression patterns. These results are summarized in Figure 2. Among the 112 promoter constructs that displayed the same GFP expression pattern in more than one transgenic plant, promoters from 79 genes showed the same GFP expression patterns in transgenic plants obtained from different floral dip events, providing high confidence in the location of expression of the transgene. GFP-expressing lines for 27 different constructs were randomly chosen for promoter re-amplification using flanking primers, and the PCR products were sequenced. All were confirmed to be from the intended promoters, as shown in additional file 2: Table S2. In addition, leaf PCR using GFP primers was performed on 423 transgenic lines, from 83 promoters, that did not show GFP expression. All had positive amplification, indicating the presence of the GFP construct in these lines. In total, we recorded 2,287 GFP expression images and described their expression patterns with Plant Ontology (PO) codes. A total of 3,371 PO codes were assigned, as more than one PO code referencing different parts of the plant could be assigned to one image, and collectively these represented eighty-one different expression patterns (Table 1).
The leaf vascular system was the most frequently annotated code and was assigned to 38 promoters. For example, the Twinscan-predicted At.chr1.16.7 promoter construct shows expression in vascular tissue from leaf (PO:0000036, Figure 3A), petal (PO:0000054) and sepal (PO:0004723, Figure 3B). Examples of multiple PO codes/expression patterns from one construct include novel_chr1_5915494 (a novel gene discovered in our previous project), which is expressed in root (PO:0009005, Figure 3C) and seed (PO:0009010, Figure 3D). Some of the expression patterns that we detected are very specific. The promoter-reporter construct from AT1G64820 is expressed throughout the root but not in the root tip (Figure 3E), while the construct from AT4G13985 was expressed only in the root tip (Figure 3F). Many instances of GFP expression were detected in floral organs, such as petal expression from the promoter-reporter construct of AT2G17845 (Figure 3G) and carpel expression from that of AT2G40250 (Figure 3H). A listing of all genes assigned to each PO code is provided in additional file 3: Table S3.
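The counts in this paragraph can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only numbers quoted in the text; the dictionary keys and variable names are illustrative labels, not terms from the project database.

```python
# Breakdown of the 150 GFP-positive promoters, as reported in the text.
GFP_POSITIVE_BREAKDOWN = {
    "consistent pattern in two or more plants": 112,
    "single recovered line, GFP positive": 19,
    "several lines recovered, only one GFP positive": 14,
    "inconsistent patterns between plants": 5,
}

CONSTRUCTS_WITH_TRANSGENIC_LINES = 266
CONSISTENT_ACROSS_FLORAL_DIPS = 79
IMAGES_RECORDED = 2287
PO_CODES_ASSIGNED = 3371

# The four categories should account for every GFP-positive promoter.
gfp_positive_total = sum(GFP_POSITIVE_BREAKDOWN.values())

# Fraction of transformed constructs that gave visible GFP expression.
gfp_positive_fraction = gfp_positive_total / CONSTRUCTS_WITH_TRANSGENIC_LINES

# Share of the consistent constructs confirmed across independent floral dips.
cross_dip_fraction = (
    CONSISTENT_ACROSS_FLORAL_DIPS
    / GFP_POSITIVE_BREAKDOWN["consistent pattern in two or more plants"]
)

# Average number of PO codes attached to one image.
codes_per_image = PO_CODES_ASSIGNED / IMAGES_RECORDED

if __name__ == "__main__":
    print(f"GFP-positive promoters: {gfp_positive_total}")
    print(f"GFP-positive fraction of constructs: {gfp_positive_fraction:.1%}")
    print(f"Consistent across floral dips: {cross_dip_fraction:.1%}")
    print(f"PO codes per image: {codes_per_image:.2f}")
```

The four categories sum to the 150 GFP-positive promoters, which corresponds to roughly 56% of the 266 constructs for which transgenic lines were obtained.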
The aim of our study was to examine the in vivo expression of many Arabidopsis genes of unknown function that show little or no expression based on EST, MPSS, or microarray data using reporter (GFP) genes driven by their native promoter and thus learn something about their potential function. In this study, we selected 627 Arabidopsis genes with unknown function, based on TIGR5 genome annotation, including unannotated genes located in the intergenic regions revealed in our previous RACE experiments , intergenic genes predicted by EuGene  and/or Twinscan , and annotated genes with zero or less than 5% "present" gene calls from the 1,381 ATH1 arrays available at the start of these experiments. We developed a high throughput pipeline to generate promoter-reporter (GFP) constructs and transform them into Arabidopsis.
At the outset, three Gateway compatible binary vectors were compared for the expression level of their reporter genes and their convenience for this project: 1) pBGWFS7 http://gateway.psb.ugent.be/search/index/transcriptional_reporters/any which contains dual reporter genes (GFP and GUS) with BASTA selection, 2) pYXT1 which contains a GUS reporter gene with kanamycin selection, and 3) pYXT2 which contains the eGFP reporter gene  with kanamycin selection. We used both pYXT1 and pYXT2 vectors successfully in our previous study . For comparison, we cloned 5 promoters into each of these vectors and found that, although the dual reporter vector, pBGWFS7 that contains the BASTA selection marker is simplest for selection of the transgenic lines, only GUS expression and no GFP expression was detected. For pYXT1 and pYXT2 vectors, both GUS and GFP expression were detected equally well. However, we chose to use the pYXT2 vector because it eliminates the need for GUS staining that would increase the amount of effort required in these experiments.
A high throughput method was developed to facilitate the ease of cloning and transformation over traditional gene-by-gene methods such as that used previously . It includes a Perl script to design Gateway-compatible primers for cloning the promoter regions of candidate genes, PCR amplification of promoters of candidate genes, Gateway BP cloning, and Gateway LR cloning all in 96-well format, Agrobacterium transformation by tri-parental mating and floral dipping in 50 ml Falcon tubes (see MATERIALS AND METHODS). The 5' ("left") primer is located at least 2000 bp upstream of the start position of the coding sequence of the candidate genes since many studies have shown that region 1-2 kb upstream of the translation start site of genes determines the specificity of gene expression [39-44]. The 3' ("right") primers for the promoters were all located between 50 bp and 150 bp downstream of the translation start site of the target genes allowing an in-frame fusion with the GAL4-VP16 component of the reporter construct that contained at least 16 amino acids of the studied gene's coding sequence as well as the 8 amino acid linker from Gateway system. This script made the primer design process facile, reliable and consistent.
Positive BP clones were picked and sequenced to confirm the cloned promoter sequences. Residual BP DNA from the sequence confirmation step was used for the Gateway LR cloning reaction in 96-well plate format. The triparental mating method was chosen to transfer the Gateway expression clones from E. coli to Agrobacterium [33,45], eliminating the need for DNA isolation from the selected clones as well as the subsequent high throughput electroporation. For the plant transformation, we cultured only 50 ml Agrobacterium, which allowed all steps (precipitation and resuspension of Agrobacterium cells and floral dipping of Arabidopsis plants) to be carried out in single 50 ml Falcon tubes. Recently, Davis et al  successfully transformed Arabidopsis by dipping plants directly into Agrobacterium cultures supplemented with surfactant, eliminating the need for media exchange to a buffered solution and further simplifying the transformation process.
A very important feature of this pipeline was the creation of a project specific laboratory information management system (LIMS) in a MySQL relational database to track all stages of the pipeline: candidate genes, primers, different stages of clone construction, transgenic lines, GFP checking at different stages, PCR and sequencing results and annotation of all the GFP images.
This high throughput cloning and data tracking pipeline made our project management efficient and robust. However, out of 627 targeted promoters, only 266 constructs were ultimately transferred into Arabidopsis plants. This overall success rate reflects a degree of failure at each experimental step. For example, the success rate for obtaining entry clones by BP cloning was 74.8%, for expression clones from entry clones by LR cloning 94.4%, for Agrobacterium clones by triparental mating 79.7%, and for kanamycin-resistant transgenic plants by floral dip 75.4%. Because of the high throughput nature of this project, we have not yet repeated any experimental step. The overall success rate would certainly increase if the unsuccessful clones at each step were reprocessed.
In order to confirm that the expression patterns are from the intended cloned promoters, vector-based primers flanking the cloning site were used to amplify the cloned promoters from transgenic plants showing GFP expression, and the PCR products were sequenced for confirmation. Of 27 different constructs showing GFP expression, all the promoters were verified as correct. However, since not all lines were tested, researchers may wish to perform their own confirmation before using our lines. A total of 112 promoter-reporter constructs show the same expression in more than one plant. For 79 of these constructs, the same GFP expression patterns were observed in transgenic lines derived from independent floral dips, and these are thus the most reliable data set. The other 33 constructs produced the same GFP expression patterns in plants grown from separate seed borne by a single dipped plant. It has been shown that female reproductive tissues are the primary target of Agrobacterium-mediated transformation and that the transformants derived from the same seed pod contain independent T-DNA integration events.
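The per-step attrition can be recomputed directly from the stage counts reported earlier (627, 469, 442, 353 and 266). The sketch below is illustrative only: the stage labels and function names are our own, and values recomputed this way may differ slightly in the last digit from the rounded percentages quoted in the text.

```python
# Number of promoters remaining after each stage, as reported in the text.
STAGES = [
    ("candidate promoters", 627),
    ("entry clones (BP cloning)", 469),
    ("expression clones (LR cloning)", 442),
    ("Agrobacterium clones (triparental mating)", 353),
    ("kanamycin-resistant transgenic plants (floral dip)", 266),
]


def step_rates(stages):
    """Return (stage name, fraction retained from the previous stage)."""
    return [
        (name, count / previous_count)
        for (_, previous_count), (name, count) in zip(stages, stages[1:])
    ]


def overall_rate(stages):
    """Fraction of the starting set that reaches the final stage."""
    return stages[-1][1] / stages[0][1]


if __name__ == "__main__":
    for name, rate in step_rates(STAGES):
        print(f"{name}: {rate:.1%}")
    print(f"overall: {overall_rate(STAGES):.1%}")
```

Because the overall rate is the product of the step rates, four steps that each retain 75-95% of their input still leave well under half of the starting set, which is why reprocessing failures at any single step would raise the final yield.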
The validity of the specific patterns of GFP expression from a representative set of promoters was confirmed by quantitative real-time PCR (qRT-PCR) on RNA samples from multiple tissues (additional file 4: Table S4). In every case, the tissue showing the highest expression (lowest Ct) by qRT-PCR was the one from which GFP expression was observed, and in almost every case this expression value was many times higher than any of the other tissues examined.
In this study, we checked GFP expression at 4 different stages: on the selection plate around 10 days after germination, at the rosette stage in soil, just before flowering, and at the flowering stage. These stages were chosen to cover several developmental stages, to make the large amount of GFP screening manageable, and to minimize the stress on the T0 transgenic plants (e.g. by checking root GFP on the selection plates and during transplanting to soil). If the kanamycin-selected transgenic plants did not show any GFP expression at any of the stages examined, PCR with GFP-specific primers was used to confirm the presence of the reporter transgene. Out of 256 plants representing 89 promoter-reporter constructs without GFP expression that we tested by leaf PCR, all were positive with GFP primers. There are several possible reasons for the lack of detectable GFP expression in these lines. The promoter might be active only under conditions or at specific developmental stages not examined in this study. Alternatively, in contrast to the localized expression seen with many of the promoters, those without visible GFP expression may in fact be expressed in the plant but at levels too low to be detected by this method. It is also possible that some of the promoter-reporter constructs were truncated or rearranged during T-DNA integration, or that gene silencing occurred. In addition, the inconsistent GFP expression patterns that were detected in different transgenic lines of 5 promoter constructs may be due to position effects, to truncation or rearrangement of the constructs during transformation, or to human error.
The goal of this project was to use the expression of promoter-reporter constructs in transgenic plants to infer the function of these no/low expression genes. Promoters from 35 of the genes tested drove GFP expression in hydathodes, secretory structures on leaf margins. An example of hydathode expression from the promoter-reporter construct of gene AT02EUG13430 is shown in Figure 3K. Studies have shown that some genes expressed in hydathodes are related to plant tolerance to toxicity. For example, the Bot1 gene in barley is responsible for boron-toxicity tolerance, the MTP11 gene in Arabidopsis is associated with plant tolerance to manganese, and AtHMA3, a P1B-ATPase protein, plays a role in the detoxification of heavy metals. AtCML9, a calmodulin-like protein from Arabidopsis thaliana, can alter plant responses to abiotic stress and abscisic acid, and the expression of its promoter-reporter construct also included hydathodes. In addition, hydathodes are one of the expression locations of a promoter-reporter construct from ECA3, a Golgi-localized P2A-type ATPase that plays a crucial role in manganese nutrition in Arabidopsis. Thus, it is possible that some of the genes of unknown function analyzed in this study that show hydathode expression are also involved in tolerance or detoxification pathways, suggesting a direction for further study. A motif search by Multiple Em for Motif Elicitation (MEME) across all promoter sequences with hydathode expression found the motif CTTAAGA (P = 8.67e-09). However, its function and specificity will require experimental verification.
Twenty-six promoter-reporter constructs expressed GFP in the abscission zones of siliques, flowers and leaves, including expression around the flower abscission zone from the construct of gene AT4G18395 (Figure 3J). Abscission is a physiological process that involves the programmed separation of entire organs, such as leaves, petals, flowers, and fruit; it allows plants to discard nonfunctional or infected organs and promotes dispersal of progeny. Promoter-reporter constructs from a number of confirmed abscission-related genes, including BOP1, BFN1 [55,56], HAE, HSL2, MKK4,5 and AtZFP2, show similar expression at abscission zones. Using MEME, the sequence TAACCACTCA was the most significant motif found in the promoters analyzed in this study.
Thirty-six promoter-reporter constructs are expressed in trichomes or the socket cells that surround a trichome and provide support, suggesting their possible function in trichome development, expansion and branching. Many promoter constructs in our study were expressed in specific floral organs, including sepal, petal, filament, anther, carpel, and pollen. For example, pollen specific GFP expression was detected from the construct of gene AT2G24370 (Figure 3I, L). In addition to providing the clues to their function, they may also provide novel promoters for plant genetic engineering. For example, it has been shown that completely sterile Arabidopsis plants can be generated by engineering carpel and stamen-specific expressed genes . Use of the Ory s1 promoter (pollen-specific promoter) with antisense Lol p5A cDNA led to the production of hypoallergenic rye grass (Lolium perenne) .
Overall, in our study, positive transgenic plants were obtained from 266 promoter constructs derived from our intergenic and non- or low-expressing genes of unknown function. Among them, about 56% of constructs showed GFP expression in Arabidopsis. Thus the in vivo expression data from promoter-reporter constructs generated in this study has provided insights into possible functions of many genes previously lacking both expression data and functional annotation as well as another great gene expression resource for the research community.
ATH1 no- or low-expression genes were identified as follows. The results of 1381 ATH1 arrays were downloaded from TAIR and the Affymetrix MAS calls used to classify them as "expressed" (present call) or "non-expressed" (marginal or absent call) in each experiment. "ATH1 no expression genes" and "ATH1 low expression genes" are those showing expression in either none or in less than 5% of the experiments respectively. Candidate genes from the microarray analysis were excluded if they were shown by massively parallel signature sequencing (MPSS)  to be expressed.
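The selection rule described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes each gene's MAS calls are available as a list of "P" (present), "M" (marginal) or "A" (absent) values, one per array, and the function names and example gene identifiers are hypothetical.

```python
def classify_gene(calls, low_threshold=0.05):
    """Classify one gene from its per-array MAS detection calls.

    Only a present call ("P") counts as expressed; marginal ("M") and
    absent ("A") calls both count as non-expressed, as in the text.
    """
    calls = list(calls)
    if not calls:
        raise ValueError("at least one array call is required")
    present_fraction = sum(call == "P" for call in calls) / len(calls)
    if present_fraction == 0:
        return "no expression"
    if present_fraction < low_threshold:
        return "low expression"
    return "expressed"


def select_candidates(calls_by_gene, mpss_expressed):
    """Keep no/low-expression genes that MPSS did not detect as expressed."""
    candidates = {}
    for gene, calls in calls_by_gene.items():
        label = classify_gene(calls)
        if label != "expressed" and gene not in mpss_expressed:
            candidates[gene] = label
    return candidates


if __name__ == "__main__":
    # Hypothetical genes, each scored on 100 arrays.
    example = {
        "geneA": ["A"] * 100,              # never present
        "geneB": ["P"] * 4 + ["A"] * 96,   # present in 4% of arrays
        "geneC": ["P"] * 5 + ["M"] * 95,   # present in 5% of arrays
        "geneD": ["A"] * 100,              # silent on arrays, seen by MPSS
    }
    print(select_candidates(example, mpss_expressed={"geneD"}))
```

Under this reading, a gene present in exactly 5% of arrays is not a candidate, because the text specifies "less than 5%".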
The Perl script for primer design for promoter cloning has the following features: 1) it aligns the CDS of each candidate gene to the Arabidopsis genomic sequence using BLAT to locate the ATG start codon; 2) it extracts 3000 bp upstream and 150 bp downstream of the ATG as the genomic target region; 3) it invokes Primer3 http://primer3.sourceforge.net/ to design primers using PRIMER_PRODUCT_SIZE_RANGE '2150-2650' and TARGET '3000, 50', which places the "left" primer at least 2000 bp upstream of the ATG start codon and the "right" primer at least 50 bp downstream of the start codon; 4) it checks the specificity of potential primer pairs against the Arabidopsis genome using BLAST at 80% identity, allowing a maximum of 3 hits in the genome per pair of primers (each pair also hits its own target site in the genome). The script iterates through primer pair design until these criteria are fulfilled. In addition, the user can directly input Gateway sequences to be added at the ends of the upstream and downstream primers (aaaaagcaggct is added to the 5' end of upstream primers and agaaagctggt to the 5' end of downstream primers). Thus, the output of this script is a set of primers with gene-specific sequences at their 3' ends and Gateway cloning sequences at their 5' ends, in a plate-ordering format that meets the primer manufacturer's requirements. This primer design script is available on request from the authors.
The Gateway cloning strategy was used to make promoter-reporter constructs largely according to the protocols in the Gateway Cloning Technology booklet (Invitrogen, Carlsbad, CA). Promoter amplifications were performed in 96-well plate format and in 2 PCR stages in order to add Gateway cloning sequences at the ends. The first PCR reaction contained 13.4 μL H2O, 4.0 μL 5× HF buffer, 0.40 μL 10 mM dNTPs, 2.0 μL gDNA (10 ng/μL) and 0.2 μL Phusion/iProof enzyme (New England Bio Lab, Ipswich, MA). The PCR conditions were: 1×: 98°C 30 sec, 1×: 98°C 10 sec, 63°C 30 sec, 72°C 2 min, 1×: 98°C 10 sec, 62°C 30 sec, 72°C 2 min, 1×: 98°C 10 sec, 61°C 30 sec, 72°C 2 min, 1×: 98°C 10 sec, 60°C 30 sec, 72°C 2 min, 1×: 98°C 10 sec, 59°C 30 sec, 72°C 2 min, 9×: 98°C 10 sec, 58°C 30 sec, 72°C 2 min, 1×: 72°C 10 min, 1×: 4°C hold. The primers used in the first PCR reaction were from the primer design script and contained gene-specific sequence as well as the attB site sequence for BP cloning. The second PCR reaction contained 18.8 μL H2O, 6 μL 5× HF buffer, 0.8 μL 10 mM dNTPs, 4.0 μL AttB Primer mix (10 μM) and 0.4 μL Phusion/iProof (New England Bio Lab, Ipswich, MA), with 10 μL from the first PCR reaction used as template. The second PCR conditions were: 1×: 98°C 30 sec, 19×: 98°C 10 sec, 56°C 30 sec, 72°C 2 min, 1×: 72°C 10 min, 1×: 4°C hold. The universal Gateway adaptor primers used in the second PCR reactions were as follows: attB1adaptor_primer: 5'-GGGGACAAGTTTGTACAAAAAAGCAGGCT-3', attB2adaptor_primer: 5'-GGGGACCACTTTGTACAAGAAAGCTGGGT-3'. Five μL from the second PCR reaction was diluted with H2O to 25 μL, and 1 μL of the diluted PCR product was used in the Gateway BP reaction. Gateway BP reactions were done in 96-well plates. Each reaction contained 1 μL of diluted PCR product, 0.75 μL TE buffer (pH 8.0), 0.25 μL pDONR207 (150 ng/μL) and 0.5 μL BP Clonase II mix. The reaction mixture was incubated at 25°C for 1 hour to overnight, depending on the experimental schedule.
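To make the geometry behind these settings concrete, the sketch below works out where the 5' end of the PCR product can fall relative to the ATG, and then adds the Gateway tails to a pair of gene-specific primers. It is a simplified illustration in Python rather than the authors' Perl script. It assumes 0-based offsets within the 3150 bp extracted region and reads TARGET as "start, length"; the tail sequences are copied as quoted in the text, and the well-naming helper is a hypothetical stand-in for a manufacturer's plate-ordering format.

```python
REGION_UPSTREAM = 3000             # bp extracted upstream of the ATG
REGION_DOWNSTREAM = 150            # bp extracted downstream of the ATG
PRODUCT_SIZE_RANGE = (2150, 2650)  # PRIMER_PRODUCT_SIZE_RANGE
TARGET_START, TARGET_LENGTH = 3000, 50  # TARGET '3000, 50'

# Gateway tails, copied as quoted in the text.
UPSTREAM_TAIL = "aaaaagcaggct"
DOWNSTREAM_TAIL = "agaaagctggt"


def left_primer_distance_range():
    """Return (min, max) distance in bp of the product's 5' end from the ATG."""
    region_length = REGION_UPSTREAM + REGION_DOWNSTREAM
    min_size, max_size = PRODUCT_SIZE_RANGE
    # The product must end inside the extracted region.
    latest_start = region_length - min_size
    # The product must extend through the end of the target interval.
    earliest_start = max(TARGET_START + TARGET_LENGTH - max_size, 0)
    return REGION_UPSTREAM - latest_start, REGION_UPSTREAM - earliest_start


def add_gateway_tails(left_specific, right_specific):
    """Prefix each gene-specific primer (upper case) with its tail (lower case)."""
    return (
        UPSTREAM_TAIL + left_specific.upper(),
        DOWNSTREAM_TAIL + right_specific.upper(),
    )


def well_name(index):
    """Map 0..95 to A1..H12, filling a 96-well plate row by row."""
    if not 0 <= index < 96:
        raise ValueError("a 96-well plate has wells 0..95")
    return f"{'ABCDEFGH'[index // 12]}{index % 12 + 1}"


if __name__ == "__main__":
    print(left_primer_distance_range())
    # Placeholder gene-specific sequences, for illustration only.
    left, right = add_gateway_tails("acgtacgtacgtacgtacgt", "ttggccaattggccaattgg")
    print(well_name(0), left, right)
```

With a 3150 bp region and a minimum product of 2150 bp, the product cannot start later than offset 1000, which is how the size range alone forces the left primer to sit at least 2000 bp upstream of the ATG.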
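The first-round cycling conditions amount to a touchdown program: the annealing temperature drops by 1°C per cycle from 63°C to 59°C and is then held at 58°C for 9 cycles. The sketch below generates that schedule as data. The representation (a list of repeat counts with temperature and duration steps) and the function names are our own choices for illustration and are not tied to any thermocycler software.

```python
DENATURE = (98, 10)   # temperature in degrees C, duration in seconds
EXTEND = (72, 120)


def touchdown_program(start_anneal=63, final_anneal=58, final_cycles=9):
    """Build the first-round PCR program as (repeats, [(temp, seconds), ...])."""
    program = [(1, [(98, 30)])]  # initial denaturation
    # One cycle at each annealing temperature above the final one.
    for anneal in range(start_anneal, final_anneal, -1):
        program.append((1, [DENATURE, (anneal, 30), EXTEND]))
    # Remaining cycles at the final annealing temperature.
    program.append((final_cycles, [DENATURE, (final_anneal, 30), EXTEND]))
    program.append((1, [(72, 600)]))  # final extension
    return program


def annealing_temperatures(program):
    """Annealing temperature of each three-step cycling block, in order."""
    return [steps[1][0] for _, steps in program if len(steps) == 3]


def amplification_cycles(program):
    """Total number of denature/anneal/extend cycles in the program."""
    return sum(repeats for repeats, steps in program if len(steps) == 3)


if __name__ == "__main__":
    for repeats, steps in touchdown_program():
        print(repeats, steps)
    print("cycles:", amplification_cycles(touchdown_program()))
```

Counted this way, the first round has 14 amplification cycles (5 touchdown cycles plus 9 at 58°C), and the second round described in the text adds 19 cycles at a fixed 56°C annealing temperature.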
After incubation, 0.3 μL proteinase K was added to each reaction and incubated at 37°C for 10 min. Transformation was done in 96 well plates as follows: TOP10 competent E. coli cells (Invitrogen, Carlsbad, CA) were thawed on ice and 24 μL of cells was added into each BP reaction and incubated on ice for 20 min. Cells were heat shocked at 42°C for 30 sec and put on ice for 2 min, then 175 μL room temp SOC (Invitrogen, Carlsbad, CA) was added to each sample that was then incubated at 37°C, 225 rpm for 1 hour. Fifty μL of each transformation culture was plated on LB plates containing gentamycin (7 μg/mL) using glass beads in 4 Well Rectangular MutiDish w/Lid (Thermo Fisher Scientific, Rochester, NY). For each reaction, two positive colonies were picked and sequenced to confirm the cloned promoter. One μl of DNA from each sequence-confirmed BP clone (robotically isolated in the sequencing center) was used in a Gateway LR reaction using pYXT2 as destination vector in 96 well plates. Each reaction contained: 1 μl of entry clone DNA, 0.75 μl TE buffer (pH 8.0), 0.25 μl pYXT2 (150 ng/μl), 0.5 μl LR Clonase II mix. Incubation time at 25°C was 1 hour to overnight depending on the experimental schedule. LR clone selection was the same as BP clone selection except for the substitution of kanamycin (50 μg/mL) for gentamycin in the LB plates.
For triparental mating, a 50 mL culture of Agrobacterium GV3101 in LB with rifampicin (50 μg/mL) and gentamicin (50 μg/mL) was grown overnight at 28°C, 250 rpm, and a 50 mL culture of the pRK2013 E. coli helper strain in LB with kanamycin (50 μg/mL) was grown overnight at 37°C, 250 rpm. At the same time, 250 μL cultures of LR clones were incubated overnight in LB medium with kanamycin (50 μg/mL) at 37°C in deep-well blocks. The following day, for each LR clone, 50 μL of Agrobacterium culture, 50 μL of the E. coli helper strain containing the pRK2013 plasmid and 50 μL of E. coli LR clone culture were plated together on LB agar in one compartment of a 4 Well Rectangular MultiDish w/Lid (Thermo Fisher Scientific, Rochester, NY). The next day, a small loopful of bacteria from the lawn in each well was streaked onto LB plates containing rifampicin (50 μg/mL), gentamicin (50 μg/mL) and kanamycin (50 μg/mL) to select for Agrobacterium carrying the pYXT2 construct with the target promoter. Two positive colonies from each reaction were inoculated into corresponding positions of two 96-well blocks containing 250 μL LB medium with rifampicin (50 μg/mL), gentamicin (50 μg/mL) and kanamycin (50 μg/mL) and grown at 28°C, 225 rpm for 2 days. PCR was then used to confirm the positively selected Agrobacterium colonies as follows. Twenty μL of 20 mM NaOH solution was added to each well of a new PCR plate, then 3 μL of Agrobacterium culture was added to each well. The PCR plate was sealed and incubated at 37°C for 5 minutes. Then 2 μL of NaOH-treated Agrobacterium culture was transferred into a new PCR plate, and a PCR reaction was set up by adding 4.0 μL H2O, 10 μL 2× PCR mix (New England Biolabs, Ipswich, MA) and 4.0 μL promoter-specific primer pair (2.5 μM). The PCR conditions were: 95°C for 2 min; 30 cycles of 94°C for 30 sec, 55°C for 30 sec and 72°C for 3 min; 72°C for 5 min; and a 4°C hold. The positive colonies were used in plant transformation.
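The colony PCR set-up above implies final concentrations that are not stated explicitly. They can be worked out with the usual dilution relation C1·V1 = C2·V2. The sketch below is only that arithmetic: the helper name `final_conc` is hypothetical, volumes are assumed to be additive, and the text does not say whether the 2.5 μM primer stock concentration refers to each primer or to the pair combined.

```python
def final_conc(stock_conc, stock_vol, total_vol):
    """Concentration after diluting stock_vol of stock into total_vol (C1*V1 = C2*V2)."""
    return stock_conc * stock_vol / total_vol


# Lysis step: 20 uL of 20 mM NaOH plus 3 uL of Agrobacterium culture.
lysate_vol_ul = 20.0 + 3.0
naoh_in_lysate_mM = final_conc(20.0, 20.0, lysate_vol_ul)

# PCR step: 2 uL lysate + 4 uL H2O + 10 uL 2x PCR mix + 4 uL primer pair (2.5 uM).
pcr_vol_ul = 2.0 + 4.0 + 10.0 + 4.0
primer_final_uM = final_conc(2.5, 4.0, pcr_vol_ul)
pcr_mix_final_x = final_conc(2.0, 10.0, pcr_vol_ul)
naoh_in_pcr_mM = final_conc(naoh_in_lysate_mM, 2.0, pcr_vol_ul)

if __name__ == "__main__":
    print("PCR volume (uL):", pcr_vol_ul)
    print("primer final concentration (uM):", primer_final_uM)
    print("PCR mix final strength (x):", pcr_mix_final_x)
    print("NaOH carried into PCR (mM):", round(naoh_in_pcr_mM, 2))
```

With these volumes the reaction totals 20 μL, the 2× mix ends up at 1×, the primer stock is diluted to 0.5 μM, and the NaOH carried over from the lysis step is diluted to under 2 mM.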
Arabidopsis thaliana ecotype Columbia-0 was used in all our experiments. Plants were grown on Redimix at 25°C under 24-hour light (cool white fluorescent; ~150 microeinsteins). Plant transformation and seed selection were done according to standard methods, except that only 50 mL of the Agrobacterium was cultured and used in floral dipping, so that all subsequent manipulations up to and including floral dipping were done in 50 mL Falcon tubes (Becton Dickinson Labware, Franklin Lakes, NJ). Three independent transformations were done for each construct (i.e. three plants were dipped separately in the same Agrobacterium tube), and up to 3 plants from each kanamycin (50 μg/mL) selection plate were transferred into soil.
Plants were observed using an Olympus SZX12 stereomicroscope equipped with a 100 W mercury lamp for epifluorescence and a parfocal 1.6× objective. Images were recorded with an Olympus DP71 digital camera.
Leaf RNA was isolated from basal rosette leaves of plants about 21 days after germination. Flower RNA was isolated from opened flowers and unopened flower buds. Young silique RNA was isolated from immature siliques; root RNA isolation has been described previously. RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA) as described by the manufacturer and then purified using RNeasy columns (Qiagen, Valencia, CA, USA). First-strand cDNA was synthesized by priming with oligo-dT using SuperScript III reverse transcriptase (Invitrogen, Carlsbad, CA, USA) following the instructions of the provider. PCR reactions were carried out in an ABI PRISM® 7900 HT Sequence Detection System (Applied Biosystems, Foster City, CA, USA). SYBR® Green was used to quantify dsDNA synthesis. Reactions (10 μL total volume) were amplified using the following standard PCR protocol: 50°C for 2 min; 95°C for 10 min; 40 cycles of 95°C for 15 sec and 60°C for 1 min. SYBR® Green fluorescence was measured continuously. Three biological and two technical replicates were used for analysis. PCR efficiency was estimated using LinReg software with data obtained from the exponential phase of each individual amplification plot. Cycle time (Ct) values were taken at a threshold value of 0.2.
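The text gives the replicate structure, the per-reaction efficiency estimate and the Ct threshold, but not the formula used to turn Ct values into expression levels. One common option, shown here purely as an assumption, is an efficiency-corrected ratio against a reference gene, E_ref^Ct_ref / E_target^Ct_target. In this sketch technical replicates are averaged within each biological replicate. The function names are hypothetical, efficiency is expressed as the amplification factor per cycle (2.0 means perfect doubling), no reference gene is named in the text, and the Ct values are invented placeholders rather than data from the study.

```python
from statistics import mean


def relative_expression(ct_target, ct_ref, eff_target=2.0, eff_ref=2.0):
    """Efficiency-corrected expression of a target gene relative to a reference gene.

    Computes eff_ref**ct_ref / eff_target**ct_target, where each efficiency is
    the amplification factor per cycle (2.0 = perfect doubling).
    """
    return (eff_ref ** ct_ref) / (eff_target ** ct_target)


def summarize(target_cts, ref_cts, eff_target=2.0, eff_ref=2.0):
    """Average technical replicates, then compute one ratio per biological replicate.

    target_cts and ref_cts are lists with one inner list of technical-replicate
    Ct values per biological replicate. Returns (mean ratio, per-replicate ratios).
    """
    per_bio = [
        relative_expression(mean(t_tech), mean(r_tech), eff_target, eff_ref)
        for t_tech, r_tech in zip(target_cts, ref_cts)
    ]
    return mean(per_bio), per_bio


if __name__ == "__main__":
    # Placeholder Ct values: 3 biological replicates x 2 technical replicates.
    target = [[28.1, 28.3], [27.9, 28.0], [28.6, 28.4]]
    reference = [[20.2, 20.1], [20.0, 20.1], [20.4, 20.3]]
    overall, per_replicate = summarize(target, reference, eff_target=1.9, eff_ref=1.95)
    print("per biological replicate:", per_replicate)
    print("mean relative expression:", overall)
```

Averaging technical replicates before forming the ratio keeps the three biological replicates as the independent units. Other normalization schemes are equally plausible, and the text does not say which was used.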
The authors declare that they have no competing interests.
YX participated in the experimental work and wrote the manuscript. JCR, ELM, BAU, WAM, and WW performed the experiments. JZ and HCW carried out bioinformatic and database-related tasks. CDT conceived and coordinated the project and contributed to the manuscript.
All authors have read and approved the final manuscript.
Table S1. Identifiers and promoter sequences of genes used in this study.
Table S2. Listing of all genes used in this study with their TAIR identifiers (where available) and the success status at each step in the transgenic plant production pipeline.
Table S3. Listing of all genes assigned to each Plant Ontology (PO) code.
Table S4. Validation of GFP expression patterns by quantitative real time RT-PCR.
We thank all members of the plant group at TIGR/JCVI for their help and Dr. Harry J Klee for providing us the E. coli helper strain for triparental matings. This work was supported by the National Science Foundation (grant no. 0520022).