|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: SZ LL. Performed the experiments: SZ. Analyzed the data: SZ LL. Wrote the paper: SZ LL.
microRNAs (miRNAs) are small, endogenous RNAs of 20~25 nucleotides, processed from stem-loop regions of longer RNA precursors. Plant miRNAs act as negative regulators of target mRNAs predominately by slicing target transcripts, and a number of miRNAs play important roles in development. We analyzed a number of published datasets from Arabidopsis thaliana to characterize novel miRNAs, novel miRNA targets, and miRNA-regulated developmental changes in gene expression. These data include microarray profiling data and small RNA (sRNA) deep sequencing data derived from miRNA biogenesis/transport mutants, microarray profiling data of mRNAs in a developmental series, and computational predictions of conserved genomic stem-loop structures. Our conservative analyses identified five novel mature miRNAs and seven miRNA targets, including one novel target gene. Two complementary miRNAs that target distinct mRNAs were encoded by one gene. We found that genes targeted by known miRNAs, and genes up-regulated or down-regulated in miRNA mutant inflorescences, are highly expressed in the wild type inflorescence. In addition, transcripts upregulated within the mutant inflorescences were abundant in wild type leaves and shoot meristems and low in pollen and seed. Downregulated transcripts were abundant in wild type pollen and seed and low in shoot meristems, roots and leaves. Thus, disrupting miRNA function causes the inflorescence transcriptome to resemble the leaf and meristem and to differ from pollen and seed. Applications of our computational approach to other species and the use of more liberal criteria than reported here will further expand the number of identified miRNAs and miRNA targets. Our findings suggest that miRNAs have a global role in promoting vegetative to reproductive transitions in A. thaliana.
Plant microRNAs (miRNAs) are involved in multiple developmental and physiological processes and negatively regulate gene transcript abundance through post-transcriptional repression of mRNAs, primarily by target cleavage . MiRNAs are generated from endogenous loci that produce transcripts with internal stem loop structures that are processed to 20~25nt small double stranded RNAs. Many proteins of the miRNA biogenesis pathway are known. RNA polymerase II generates pri-miRNAs which are stabilized by DAWDLE (DDL) . Pri-miRNAs are converted to stem-loop pre-miRNAs by DICER-LIKE1 (DCL1)  which interacts with the double stranded RNA-binding protein HYPONASTIC LEAVES1 (HYL1)  and SERRATE (SE) . The pre-miRNAs are processed either to 21 nt mature miRNA/miRNA* duplexes by DCL1  or to 24 nt mature miRNA duplexes by DCL3 . The mature miRNA duplexes are methylated by the S-adenosyl methionine-dependent methyltransferase HUA ENHANCER1 (HEN1) . HASTY (HST) is the plant homolog of mammalian EXPORTIN 5 which is known to export pre-miRNA . One strand of the methylated miRNA/miRNA* duplex is preferentially assembled with the ARGONAUTE1, AGO1 protein, which slices the mRNA targets at the miRNA target site . Despite the existence of this core pathway, its mutants have different effects on miRNA abundance and miRNA target abundance. Hypomorphic dcl1 mutants have undetectable or very low levels of mature miRNAs , and miRNA accumulation is similarly undetectable or very weak in null hen1 and hyl1 mutants , . Although HST is frequently depicted as an exportin , its function has not been validated, and hst mutants fail to accumulate miRNAs in the nucleus . In hst mutants, some miRNAs accumulate to wild-type levels, others have reduced abundance, and others are not detectable .
Over 184 Arabidopsis miRNAs have been identified (miRBASE release 10.0) . miRNAs have been predicted to regulate the expression of more than 600 genes , and 225 genes are known targets . miRNAs and their targets have been discovered using approaches including cloning and sequencing sRNAs , , bioinformatic prediction from genomic sequences , and sequencing cleaved mRNAs by parallel analysis of RNA ends (PARE) . Utilizing miRNA biogenesis/transport mutants for miRNA and miRNA target identification has not been widely employed. First, many miRNA target transcripts do not accumulate when their cognate miRNAs decrease. In one study of hen1 and dcl1 mutants, miRNA target abundances were greater than wild type in eight of eight known targets examined ; while in another study, five of ten transcripts were significantly up-regulated . Two of three miRNA targets accumulate within the hst-1 mutant , and five of eight target transcripts had higher abundance in the hyl1 mutant compared to wild-type . Second, a large number of genes are misexpressed in individual mutants causing an abundance of upregulated genes that are not miRNA targets , .
The core function of plant miRNAs is in the regulation of development . Hypomorphic dcl1 mutants and null hyl1, hst, and hen1 mutants reveal both shared and distinct developmental defects that may represent misexecution of developmental transitions . All canonical 21nt miRNA mutants (dcl1, hen1, hst, and hyl1) have a shorter stature, delayed vegetative to reproductive transitions, and reduced female fertility compared to wild type , , , . Hen1 mutants have defects in floral meristem identity, a trait also shared by dcl1 partial loss-of-function mutants , , and the juvenile development phase of hst15 is shortened . The effects of many individual misexpressed miRNA targets also highlight the role of miRNAs in plant development. Misexpression of miRNA target genes leads to plant developmental defects in meristems (HD-ZIPIII genes) , leaves (TCP) , flowers (AP2 genes; CORNGRASS1) , , and roots (NAC1) . Misexpression of targets also affects plant vegetative phase changes , . Despite these phenotypic observations, to our knowledge it is unknown if disrupting plant miRNAs can cause one tissue's transcriptome to resemble that of a different tissue. For example, in animals, delivering miRNAs into cells can cause their transcriptomes to resemble the transcriptome of the tissue where the miRNA is preferentially expressed .
This study performs novel analyses on a number of published data sets to identify miRNAs and their targets and to identify miRNA dependent tissue biases within the transcriptome. We integrate data sets including microarray profiling and sRNA deep sequencing data from miRNA biogenesis/transport mutants, microarray profiling data of mRNAs in a developmental series, and computational predictions of conserved, genomic stem-loop structures. Our results show that known miRNA targets fail to explain transcriptome patterns amongst miRNA mutants. Interestingly, transcript changes in hyl1 and hst mutants greatly overlapped, suggesting a novel functional connection between HST and HYL. Transcript changes in dcl1 and hyl1 were surprisingly distinct, and dcl3 had little effect on the transcriptome. We identified novel miRNA targets and five mature miRNAs, four of which had high similarity to previously identified mature miRNAs. One miRNA gene encodes two complementary miRNAs that target distinct transcripts. These miRNAs target transcription factors, and a gene with 4-α-glucanotransferase activity, thereby expanding miRNA functionality in metabolism. Finally, we found that genes regulated by miRNAs show strong tissue-specific patterns of expression. miRNA regulated genes tend to be highly expressed in the inflorescence, and miRNAs cause the inflorescence transcriptome to both diverge from a meristem and leaf-like state and to acquire a pollen and seed-like state.
We first analyzed ATH1 gene expression data from the inflorescences of the miRNA-defective mutants, dcl1-7, dcl3-1, hen1-1, hyl1-2 and hst15 ,  (Figure 1). The numbers of known, up-regulated miRNA targets and the levels to which known miRNA targets were up-regulated were both poorly correlated with the number of misexpressed transcripts across the canonical miRNA mutants. We found that probe sets homologous to 131 of 225 known miRNA targets are present on the ATH1 microarray, excluding cross hybridizing probe sets. The number of known, upregulated miRNA targets in hst15 (29) was less than the number in other canonical mutants (44 in dcl1; 37 in hyl1; and 38 in hen1; Figure S1), although 1,995 genes are upregulated in the hst15 mutant (Figure 1). The log-fold change distributions of known miRNA targets were also very similar between mutants with large transcriptome changes and mutants with small transcriptome changes (Figure S2). Differences and similarities of transcriptome changes between mutants were also not explained by known miRNA targets, as the number of known miRNA targets shared between two mutants was not correlated with the number of upregulated genes shared between two mutants. Of the 930 up-regulated genes in hyl1-2, 65% of transcripts increased in hst15, 35% in dcl1-7, and 44% in hen1-1 (Figure 2A). Of the 1,995 up-regulated genes in hst15, 30% transcripts increased in hyl1-2, 17% in dcl1-7, and 12% in hen1-1 (Figure 2A). hst and hyl1 shared 18 known miRNA targets, the same number as hst15 and dcl1-7 and less than between hen1 and hyl1 (24) (Figure S3). Likewise, dcl1 and hen1 have similar transcriptomes- of the 1,710 up-regulated genes in hen1-1, 49% of the transcripts increased in dcl1-7 (Figure 2A), and of the 4,238 up-regulated genes in dcl1-7, 20% of the transcripts increased in hen1-1 (Figure 2A). However, the known targets' overlap between dcl1-7 and hen1-1 (22) was about as great as those between dcl1-7 and hyl1-2 (24), and only slightly more than the overlap between dcl1-7 and hst15 (18) (Figure S3).
These analyses also revealed two unexpected results. First, the similarity of transcript abundance changes in hst15 and hyl1-2, and the differences between dcl1 and hyl1 were unexpected given the predicted miRNA biogenesis/transport pathway . The high level of hst15 and hyl1-2 intersection was described above. In contrast, 35% of the transcripts that increased in hyl1 increased in dcl1-7, and 8% of the transcripts that increased in dcl1 also increased in hyl1 (Figure 2A). The down-regulated genes showed similar intersections (Figure 2B). Second, disrupting 21nt miRNA biogenesis had a severe effect on the plant transcriptome; while disrupting long miRNA biogenesis and siRNA biogenesis had a small effect. The number of genes with transcript levels that increased relative to wild type controls ranged from 930 to 4,238 among the canonical 21nt miRNA biogenesis mutants dcl1-7, hen1-1, hst15, and hyl1-2 (Figure 1). The number of transcripts that decreased ranged from 867 to 2,172. In contrast, 258 genes increased in abundance and 210 genes decreased in abundance in the dcl3-1 mutant (Figure 1).
Our results suggested that known miRNA targets fail to explain transcription patterns amongst miRNA mutants, and we devised a method to identify both novel miRNAs and miRNA targets. The ATH1 array contains approximately 22,750 probe sets representing 23,750 genes. We reasoned that the transcripts that changed within the mutants included mRNAs to which miRNAs bind and cleave and downstream transcripts influenced by these direct targets. To identify novel miRNAs and miRNA targets, we defined a core set of genes up-regulated across the dcl1-7, hen1-1, hyl1-2 and hst mutants. This set contains 117 genes (123 probe sets changed significantly across all mutants (Figure 2A), but six probe sets cross-hybridized with multiple genes and were removed).
Transcription factors are over-represented among known miRNA targets . To infer if our core set likely contained target genes, we evaluated the frequency of GO SLIM gene ontology terms within the set. Transcription factors occurred twice as frequently as expected (12% vs. 6%), (Figure S4; Table 1; Fisher's Exact Test P<0.001), and the biological process “transcription” was over-represented within the up-regulated genes (6% vs. 4%, Table S1). Interestingly, genes classified in the “response to abiotic or biotic stimulus” biological process were also more frequent within the up-regulated genes as compared to the non-upregulated genes (6% vs. 3%, Table S1), and genes with unknown molecular functions occurred at less than half the expected frequency among up-regulated genes (11% vs. 24%, Table 1). Genes without GO SLIM terms were not inherently less likely to change in abundance than genes with GO SLIM terms, because the number of genes with unknown molecular functions among down-regulated genes is similar to their genome proportion (Table 1). Among the 105 genes significantly down-regulated across the mutants, there was no evidence that molecular function classes or biological processes were over or under represented (Table 1; Table S1; Fisher's Exact Test P>0.01).
The core set of 117 genes up-regulated across all four canonical miRNA biogenesis mutants represented a conservative starting point to identify novel miRNAs and targets. We searched for high complementarity between a set of 1,984 sRNA sequences generated by deep sequencing of sRNAs from the A. thaliana rdr2-1 mutant  and 158 predicted mRNAs derived from the 117 up-regulated genes, allowing for three or fewer mismatches. We identified 113 sRNAs complementary to 76 putative mRNAs that were derived from 56 of the 117 genes. We ran a BLAST analysis of the 113 sRNAs against 2,585 conserved Arabidopsis 20mers that have homology to conserved genomic loci that could generate a hairpin structure (AtSet3) . Forty of the 113 sRNAs were perfectly matched by 20mers in AtSet3. Twenty-nine were previously characterized miRNAs. From the remaining eleven sRNAs, we identified five novel, mature miRNA sequences. Three of the five mature miRNAs (miRNAs 3, 4, and 5) are encoded by two MIR169 genes on chromosomes 3 and 5 (Table 2). MiRNA 3 is novel. The other mature miRNAs differ from the mature miRNAs in miRBase entries MIR166a, MIR169a, MIR169b, MIR169f and MIR169g by between one or two nucleotides. All candidate miRNAs were expressed in the rdr2 mutant and were not present in sequenced sRNAs from the dcl1 mutant (Table 3). Only miRNA 2 was found in wild type. These miRNAs have low abundance (Table 3), but the abundance is not unusual relative to other known miRNAs. Lu et al. found that 65% (30 out of 46) of known miRNAs have a count equal or less than six out of 4,573 sRNAs sequenced in a 454 analysis of the rdr2 mutant inflorescence .
Pre-miRNA secondary structures were predicted by RNAfold  and evaluated by MIRcheck . The predicted secondary structures are shown in Figure S5. Interestingly, miRNAs 3 and 4 have homology to complementary sequences in the same stem loop of MIR169a (Figure 3). One miRNA has high complementarity with the disproportionating enzyme (DPE2; At2g40840), which has 4-α-glucanotransferase activity and is an essential component of the pathway from starch to sucrose in leaf cells . At2g40840 is relatively highly abundant in leaves, meristems and developing siliques and less abundant in pollen, seeds, and siliques in late developmental stages. The other miRNA has high complementarity with two CCAAT-binding transcription factors (At1g17590, At1g54160). The mature miRNAs 1, 2, 4, and 5 have high complementarity to transcription factor encoding mRNAs targeted by MIR166 and MIR169 miRNAs ,  (Table 4); however mature miRNAs 2, 4 and 5 have predicted novel target cleavage sites within these transcripts. We also observed that the CCAAT binding transcription factor At1g54160 is targeted by different mature miRNAs (1, 4, and 5). These findings revealed that one miRNA can regulate multiple target genes, and one gene can be targeted by multiple miRNAs.
We tested if genes mis-expressed in miRNA mutants have tissue-specific patterns of expression. To determine this, we computed the ranks of the genes up-regulated and down-regulated in the canonical miRNA mutant inflorescences across 23 tissues. Tissues included roots, meristems, leaves, flowers, siliques, seeds, and mature pollen . If genes up- or down-regulated across mutants had high or low transcript abundances within a tissue, these genes had high or low ranks within that tissue, respectively. We also ranked the transcripts that did not change in abundance amongst the mutants in each of the 23 tissues to obtain the expected distribution of gene expression ranks within each tissue. Significant differences in rank distributions were tested using the Komolgorov-Smirnov (KS) test (see materials and methods). We found that up-regulated core set genes were significantly abundant in the inflorescence relative to other tissues (Figure 4). This finding suggested that known miRNA targets would also be highly abundant within the inflorescence. Indeed, known targets were highly abundant within the inflorescence relative to other tissues (Figure 4). Interestingly, the genes down-regulated in the mutants were also highly expressed in the inflorescence relative to other tissues (Figure 4).
We also investigated whether miRNAs could cause the transcriptome of one tissue to resemble the transcriptome of another tissue. Using the tissue corrected distributional bias test (see materials and methods), we found that the up-regulated genes in miRNA biogenesis mutant inflorescences were preferentially expressed in wild type leaves and meristems (Figure 5A), and less expressed in pollen and seeds (Figure 5B). The down-regulated genes in miRNA biogenesis mutants were preferentially expressed in wild-type pollen, siliques and seeds (Figure 5B), and less expressed in roots, meristems and leaves (Figure 5A). Thus mutations in miRNA biogenesis genes cause the inflorescence transcriptome to resemble the leaf and meristem transcriptomes. In addition, the mutations make the inflorescence transcriptome less similar to mature pollen and seed transcriptomes.
Across canonical miRNA biogenesis/transport mutants, the number of misexpressed, known miRNA targets did not correlate with the number of transcripts that changed in abundance. In addition, the number of known miRNA targets upregulated between mutants did not correlate with the number of mRNAs upregulated between mutants. Thus, we postulated that transcriptome differences amongst mutants are due to roles of the biogenesis proteins in processes outside of miRNA biogenesis and/or that a number of miRNA targets have not been identified. We examined an intersection of four mutant transcriptomes to identify novel miRNAs and miRNA targets.
Analysis of the miRNA mutant transcriptomes also showed that the pattern of transcriptional changes amongst mutants was not always consistent with the current model for miRNA biogenesis/transport , and that dcl3 had little effect on the transcriptome. Transcripts upregulated in both hyl1 and hst were the same almost two-times more often than the transcripts that changed in other hyl1 and hst comparisons, and a relatively low number of transcripts were shared between dcl1 and hyl1. Of the 930 transcripts upregulated in hyl1, 65% were upregulated in hst15, and of the 1,995 transcripts upregulated in hst15, 30% were found in hyl1 (Figure 2A–2B). In Arabidopsis, RISC loading may occur both in the nucleus and in the cytoplasm . One could envision HYL1 shuttling a subset of mature miRNAs to HST for export while other miRNA duplexes are integrated into the nuclear RISC. However, of the 37 known miRNA targets up-regulated in hyl1 and of the 29 known miRNA targets upregulated in hst, 18 were shared (Figure S3), a number similar to other mutant pairs. As mentioned above, the role of HST as an exportin involved for miRNA transport has not been confirmed , ; we suggest that HYL may interact with HST in the nucleus to facilitate mature miRNA processing. In contrast, we had expected the highest overlap of transcript changes between dcl1 and hyl1 because miRNA targets accumulate less in hyl1 than dcl1 , and HYL1 interacts with DCL1 to promote the precise processing and efficient generation of miRNAs . Only eight percent of dcl1 transcripts overlapped with hyl1, and only 35% of hyl1 transcripts overlapped with dcl1 (Figure 2A). The hypomorphic dcl1 mutant used in this study, dcl1-7, has a mutation in the helicase domain . We considered the possibility that dcl1-7 could still process HYL1 dependent miRNAs. However, dcl1-7 has been shown to fail to process miR163 which is impaired in hyl1 , and the number of shared known miRNA targets among up-regulated genes (24) is higher between dcl1 and hyl1 than between dcl1 and other mutants (Figure S3). The function of HYL1 in miRNA biogenesis is thus likely DCL1-dependent, and the genes affected in the hyl1 mutant that did not overlap with dcl1 likely reflect other roles for this protein. HYL1 was recently shown to bind to short interspersed element (SINE element) RNA and influence a variety of cellular processes . Future work could examine the relationship amongst SINEs and the gene transcript abundance changes in the HYL1 mutant. As expected, the dcl1 and hen1 mutants had a high number of shared transcript changes , . DCL3 generates 23 to 25nt miRNAs from a number of the same miRNA precursors processed by DCL1 and is also necessary to process a number of 23- to 25-nt repeat-associated siRNAs (ra-siRNA) associated with heterochromatin and DNA repeats , . The dcl3 mutant had a small effect on the transcriptome with 258 and 210 transcripts up-regulated and down-regulated, respectively. This result supports the hypothesis that long miRNAs processed by DCL3 are functionally inert , despite the fact they are developmentally regulated and conserved over time , and the result is consistent with the limited role of repeats in Arabidopsis gene regulation .
We hypothesized that a core set of 123 genes up-regulated among all the canonical miRNA mutants (Figure 2) would contain a number of novel miRNA targets. Transcription factors are known preferential, direct targets for miRNAs , and among the core set of up-regulated transcripts, only the “transcription factor activity” GO molecular function term was highly over-represented (Table 1). This result was consistent with the expectation that these genes contained a number of direct miRNA targets. Interestingly, only the “unknown molecular functions” term was significantly under-represented amongst up-regulated genes (Table 1). The small number of upregulated genes with an unknown molecular function suggests that miRNAs regulate genes that have a noticeable mutant phenotype and thus have a high chance of experimental characterization and discovery. We had also anticipated that the 123 genes would contain downstream miRNA targets that were involved in specific molecular functions or biological processes. Because only transcription factors were over-represented, we conclude that miRNA regulated downstream genes have a range of molecular functions and are involved in a number of biological processes.
By integrating a number of lines of evidence, we identified five novel mature miRNAs. Four were highly similar to previously characterized miRNAs. The novel mature miRNAs met criteria for the annotation of plant microRNAs (Table 2) . The mature 20~21 nt miRNAs are expressed (Table 3). As expected, the miRNA precursors contain extensive base-pairing, stable and conserved stem-loop structures. No asymmetric bulges exist between the miRNA and the opposite stem-arm in the stem-loop (Figure S5). The novel mature miRNAs have high complementarity with their predicted target sites and meet empirical sequence parameters for miRNAtarget recognition such as no mismatches at position 10 and 11 of the 5′ end of miRNAs and no more than one mismatch at positions 2–12 (Figure S6) . MiRNA 3 targets a novel mRNA. The other four mature miRNAs target mRNAs that are known targets of other miRNAs. Three of these four novel miRNAs direct mRNA cleavage at novel positions (Figure S6). This study identified miRNAs using only inflorescence transcriptome data and conservative criteria: putative targets had to be misexpressed in all mutants, and targets and miRNAs could differ by at most three nucleotides. It is very likely that analyses of individual biogenesis mutants and other biogenesis mutant tissues using the same or more liberal criteria would identify additional novel miRNAs and miRNA targets.
As described above, miRNAs are processed from a miRNA duplex. Mature miRNAs guide RISC to silence target mRNAs, and the complementary miRNA* is gradually degraded , . Interestingly, one miRNA gene encodes miRNAs 3 and 4 from the same pre-miRNA stem loop (Table 2, Figure 3). One miRNA targets At2g40840, an enzyme involved in the starch to sucrose transition. The complement targets CCAAT-binding transcription factors. Xue et al.  suggested that high levels of a miRNA* may suppress the function of a miRNA. Because we identified miRNAs based on the reduction of putative targets, both miRNAs identified here very likely function to cleave target mRNAs. This discovery is similar to previous reports in mammals in which RISC was able to cleave the antisense of let-7 target genes when the miRNA/miRNA* duplex was provided . Both strands of miRNA can also suppress the expression of mouse transcripts .
We found that both up- and down-regulated genes in the inflorescences of miRNA biogenesis mutants were highly expressed in wild-type inflorescences relative to other tissues (Figure 4). This observation suggests that miRNAs as a class spatially restrict target transcripts within tissues and/or affect target transcripts at a specific developmental time rather than eliminate transcripts from whole tissues . Co-expression of a miRNA and its target is consistent with this interpretation. miR171 and its targets SCL6-III and SCL6-IV are highly expressed in inflorescences . In rice, osa-miR160, osa-miR164 and osa-miR172 are co-expressed with their targets in root, leaf, seedling, endosperm and embryo .
Using the distributional bias test, we also found that genes up-regulated in the 21nt miRNA biogenesis mutants' inflorescences were preferentially expressed in leaves and meristems and less expressed in pollen and seeds (Figure 5A). Genes down-regulated within the mutants' inflorescences are significantly abundant in wild type pollen and seed and significantly low in meristems, roots and leaves (Figure 5B). Thus, miRNA mutants cause the inflorescence transcriptome to resemble leaf and meristem transcriptomes and to diverge from pollen and seed transcriptomes. A number of developmental and molecular observations are consistent with this observation. The dcl1 mutant displays defects in embryo development, floral meristem identity, and delayed flowering , and hen1 and hyl1 have similar developmental defects as dcl1 , , . Meristem and auxin related genes accumulate in the hyl1 mutant inflorescences . SE, along with DCL and HYL1 functions in primary microRNA processing . Some siliques from F1 progeny of se and hyl1 contained the abortive seeds because the se hyl1 double mutant is embryonically lethal . Finally, overexpression of an AP2 cDNA with mismatches to miR172 caused enlarged floral meristems and many whorls of petals or staminoid organs . Despite these observations, it was not clear a priori that loss of miRNAs as a class would cause one tissue's transcriptome to resemble that of another because different miRNAtarget interactions are known to have opposite functions. For example, overexpression of miRNA156 in maize prolongs juvenile development while expression of miR172 promotes the transition to the adult phase of growth , . It would be interesting to examine how miRNAs globally alter the transcriptomes in tissues other than the inflorescence.
In conclusion, analysis and integration of large mRNA profiling datasets, sRNA deep sequencing data, and computational predictions of conserved genomic stem-loop structures revealed novel mature miRNAs, novel miRNA targets, and miRNA-regulated developmental changes in gene expression in the model plant Arabidopsis thaliana. miRNAs have been identified in the important crop species maize ,  and rice , . It would be interesting to see if miRNAs in these diverse agriculturally important species also promote reproductive transcriptome changes.
ATH1 microarray raw data (.cel files) from wild type and miRNA-defective mutants, dcl1-7, dcl3-1, hen1-1, hyl1-2 and hst15 were kindly provided by Dr. Carrington's lab. Plant growth conditions, RNA extractions, labeling and hybridization were described by Allen et al., . Arrays were hybridized with cRNA derived from inflorescence RNA. We used data from three arrays from each mutant line and three wild type lines. We normalized data using gcRMA implemented in R . The expression level of each gene was represented by the mean of the log2-transformed, normalized and median polished signal levels across each set of tissue replicates, and significant differences between mutant and wild type were determined using linear models in the Bioconductor LIMMA package with a Bayesian correction for standard error. Differences were deemed statistically significant with a false discovery rate (FDR) of 0.05. We defined core sets of genes as those genes whose transcript abundance was higher or lower across all the canonical miRNA mutants.
To determine the tissue abundance of genes that were misexpressed in miRNA mutants, we obtained Affymetrix ATH1 data described by Schmid et al. . We selected expression data from 23 of 79 tissues excluding similar tissue types to avoid over-representation in statistical analyses. Tissue descriptions and abbreviations are given in Table S2. Each gene's expression levels were ranked over all 23 tissues. We determined if core sets of genes had high ranks or low ranks within a tissue compared to the ranks of a reference group, the genes not included within the core sets. A gene in a core set with a high rank within a tissue meant that the gene's transcript abundance was high in the tissue relative to other tissues.
The Komolgorov-Smirnov (KS) test was used to identify tissues in which core sets of up-regulated or down-regulated genes had significantly high or low abundances. The mutant transcripts were sampled from the inflorescence, and thus a priori may have had high ranks in tissues with transcriptomes more like the inflorescence than other tissues. To avoid this bias, we compared the abundance of miRNA-regulated genes across tissues with genes that have the same expression ranks in the inflorescence. We randomly chose sets of genes which had the same expression ranks as genes in the core sets in the inflorescence 1,000 times. For each gene set, we applied the KS test. The core sets' observed KS p values were compared with the KS p values from the permuted data set to determine the significance of the core set KS value. We named this approach the tissue corrected distributional bias test. A core set had significantly higher or lower transcript levels in the tissue of interest relative to other tissues if the P-value was less than 0.01.
Computational analyses were performed on a high performance computing “cluster of clusters”, the Shared Hierarchical Academic Research Computing NETwork (SHARCNET). We utilized the cornfish, narwhal, and whale clusters each with 14, 1,074 and 3,082 processors respectively. SHARCNET enabled rapid computations and the execution of some analyses that required large memory configurations (up to 16GB per processor).
Predicted A. thaliana genomic and mRNA sequences were retrieved from GenBank RefSeqs (accessions: NC_003070.6, NC_003071.4, NC_003074.5, NC_003075.4, and NC_003076.5). A set of 2,585 Arabidopsis 20mers (AtSet3) which are located in potential hairpin structures typical of miRNA precursors and are conserved in rice was kindly provided . Putative miRNA sequences generated by 454 sequencing were downloaded from NCBI. This data set contained 1,984 unique sRNAs derived from rdr2 inflorescence tissue which is enriched for miRNAs . We compared transcripts whose abundance increased among mutants to both the sRNAs and the AtSet3 list of sequences to identify direct miRNA targets. First, the 1,984 sRNAs were blasted against the mRNAs of up-regulated genes. Three or fewer mismatches between a sRNA and an mRNA were considered a hit. Second, we determined the presence of the matched sRNAs within AtSet3, allowing only perfect matches between the sRNAs and AtSet 3 sequences. To visualize hairpin structures, ~500 nt genomic sequences centred on each matched sRNA were retrieved from the GenBank RefSeq records. With those genomic sequences, RNAfold implemented in ViennaRNA-1.8.1  was used to predict the secondary structure. The RNAfold-predicted secondary structure was tested for miRNA hairpins and evaluated for a miRNA and miRNA* duplex by MIRcheck.
The frequency of known miRNA targets amongst genes up- and down-regulated in the miRNA biogenesis mutants. Known miRNA targets were highly represented amongst up-regulated genes.
(0.20 MB TIF)
Histograms of log-fold changes of known miRNA target transcripts represented in significantly up-regulated genes among miRNA biogenesis mutants.
(0.22 MB TIF)
Three-way Venn diagrams showing the overlap of known miRNA targets up-regulated within miRNA biogenesis mutants dcl1-7, hen1-1, hyl1-2 and hst15.
(0.22 MB TIF)
A plot of Pearson's Chi-squared test residuals of GOslim molecular function classes for genes up-regulated in canonical miRNA mutants compared to genes that were not up-regulated in the mutants. DRB combined DNA or RNA binding, nucleic acid binding, and nucleotide binding; HA, hydrolase activity; KA, kinase activity; OB, other binding; OEA, other enzyme activity; OMF, other molecular functions; PB, protein binding; RBA, receptor binding or activity; SMA, structural molecule activity; Ta, transferase activity; TA, transporter activity; TFA, transcription factor activity; UMF, unknown molecular functions.
(0.11 MB TIF)
Predicted hairpin structures of the miRNA genes that give rise to five mature miRNAs. All the hairpin structures are predicted by the RNAfold program. Mature miRNA sequences in each family recovered from 454 sRNA sequence data are represented by vertical red bars located either on the 5-prime or the 3-prime arm of the precursor.
(2.00 MB TIF)
Watson-Crick pairing between novel mature miRNAs and predicted miRNA targets. There were no mismatches at positions 10 and 11 from the 5′ end of miRNAs with nucleotides in miRNA targets (red).
(0.28 MB TIF)
Goslim categories in biological process (BP) of differentially expressed genes and non-differentially expressed genes in dcl1-7, hen1-1, hyl1-2 and hst15.
(0.07 MB DOC)
Sample names and descriptions for ATH1 microarray data.
(0.04 MB DOC)
We are appreciative of the many labs whose data we used in our analyses. Mr. Greg Downs greatly assisted in the computational execution of analyses. We thank Dr. David Wolyn and Dr. Joseph Colasanti for critical readings of this manuscript. This work was made possible by the facilities of the Shared Hierarchical Academic Research Computing Network.
Competing Interests: The authors have declared that no competing interests exist.
Funding: This work was funded by the Natural Sciences and Engineering Research Council of Canada, www.nserc-crsng.gc.ca. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.