The study applied 24 simulated conditions (permutations): two sample sizes, each combined with four CV values and three mean FC values (Table 1). For each permutation, six gene selection methods were used to determine DEGs by comparing the treated group with the control group. These gene selection methods were (1) FC: genes are rank ordered by FC and DEGs are determined by an FC cutoff only; (2–3) FC (P < 0.01) and FC (P < 0.05): genes are rank ordered first by FC and DEGs are then determined by a P-value cutoff of either 0.01 or 0.05; (4) P: genes are rank ordered by P-value from the simple t-test and DEGs are selected using a specified P-value cutoff; and (5–6) P (FC > 1.4) and P (FC > 2): genes are rank ordered first by P-value and DEGs are then determined by an FC cutoff of either 1.4 or 2. Each permutation was simulated twice to mimic conducting the same experiment in two different labs or on two different platforms. The DEGs from the two simulations were compared to assess reproducibility across labs or platforms based on the percentage of overlapping genes (POG).
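The six selection rules and the POG comparison can be illustrated with a minimal sketch. It is one possible reading of the description above, not the code used in the study. The names `select_degs`, `pog` and `METHODS` are hypothetical, fold changes are assumed to be supplied on the log2 scale, the secondary criterion is applied as a filter before the top-ranked genes are taken, POG is divided by the size of the larger list when the two lists differ in length, and the toy values for the two labs are invented for illustration.

```python
import math


def select_degs(log2_fc, p_values, n_genes, rank_by, p_cutoff=None, fc_cutoff=None):
    """Return the indices of up to n_genes genes selected as DEGs.

    rank_by is "fc" (rank by |log2 FC|, largest first) or "p" (smallest P first).
    p_cutoff and fc_cutoff are optional secondary filters; fc_cutoff is a ratio.
    """
    if rank_by not in ("fc", "p"):
        raise ValueError("rank_by must be 'fc' or 'p'")
    # Filter step: keep only genes that pass the secondary criterion, if any.
    candidates = [
        g for g in range(len(log2_fc))
        if (p_cutoff is None or p_values[g] < p_cutoff)
        and (fc_cutoff is None or abs(log2_fc[g]) > math.log2(fc_cutoff))
    ]
    # Rank step: order the surviving genes by the primary criterion.
    if rank_by == "fc":
        candidates.sort(key=lambda g: abs(log2_fc[g]), reverse=True)
    else:
        candidates.sort(key=lambda g: p_values[g])
    return set(candidates[:n_genes])


def pog(degs_a, degs_b):
    """Percentage of overlapping genes between two DEG sets.

    Assumption: the denominator is the size of the larger set.
    """
    larger = max(len(degs_a), len(degs_b))
    if larger == 0:
        return 0.0
    return 100.0 * len(degs_a & degs_b) / larger


# The six methods named in the text, expressed as keyword arguments.
METHODS = {
    "FC": dict(rank_by="fc"),
    "FC (P < 0.01)": dict(rank_by="fc", p_cutoff=0.01),
    "FC (P < 0.05)": dict(rank_by="fc", p_cutoff=0.05),
    "P": dict(rank_by="p"),
    "P (FC > 1.4)": dict(rank_by="p", fc_cutoff=1.4),
    "P (FC > 2)": dict(rank_by="p", fc_cutoff=2.0),
}

# Invented toy data: five genes measured in two simulated labs.
lab_a_fc = [2.0, 1.5, -1.8, 0.1, 0.9]
lab_a_p = [0.001, 0.2, 0.004, 0.8, 0.03]
lab_b_fc = [1.9, 1.6, -0.2, 0.2, 1.0]
lab_b_p = [0.002, 0.03, 0.6, 0.7, 0.02]

for name, kwargs in METHODS.items():
    degs_a = select_degs(lab_a_fc, lab_a_p, n_genes=2, **kwargs)
    degs_b = select_degs(lab_b_fc, lab_b_p, n_genes=2, **kwargs)
    print(name, sorted(degs_a), sorted(degs_b), pog(degs_a, degs_b))
```

Keeping the filter and the ranking as separate steps makes the difference between the methods explicit: the same two criteria are used throughout, and only their roles (ranking versus cutoff) change.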
Figure 1 compares the six gene selection methods on four datasets, each with a different noise level (CV = 2%, 10%, 30% and 100%). POG is shown as a function of the number of genes selected as differentially expressed between the two simulations of the same permutation (magnitude = 1.5 and sample size = 50). In general, the FC-based gene selection methods outperformed the P-based gene selection methods in DEG reproducibility as measured by POG. Specifically, the three FC-based methods, i.e., FC, FC (P < 0.01) and FC (P < 0.05), consistently gave the highest POG values regardless of CV. As expected, POG (i.e., DEG reproducibility) consistently decreased with increasing CV. For the P-value-based selection methods, a higher FC cutoff resulted in higher POG. All of these results are consistent with the MAQC observations.
Figure 1 The relationship of POG with the noise level in the simulated datasets: (A) Low noise (CV = 2%); (B) Medium noise (CV = 10%); (C) High noise (CV = 30%); and (D) Very high noise (CV = 100%). The simulated datasets were set to the expression ...
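The dependence of POG on noise can be explored with a small simulation. The sketch below is hypothetical and is not the simulation design of the study: the number of genes, the number of true DEGs, the spread of true effects around `magnitude`, and the log-normal noise model are all assumptions chosen for illustration, and every function name is invented. Because both groups have the same size and every gene has the same degrees of freedom, ranking by the two-sided t-test P-value is equivalent to ranking by |t|, so no P-value needs to be computed.

```python
import math
import random


def mean_and_variance(values):
    """Sample mean and unbiased sample variance of a list of floats."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def simulate_experiment(true_shift, cv, n_samples, rng):
    """Simulate one lab; return a (log2 FC, t statistic) pair for every gene."""
    # Assumed log-normal noise: a raw-scale CV maps to this SD on the log2 scale.
    sigma = math.sqrt(math.log(1.0 + cv * cv)) / math.log(2.0)
    stats = []
    for shift in true_shift:
        control = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
        treated = [rng.gauss(shift, sigma) for _ in range(n_samples)]
        mean_c, var_c = mean_and_variance(control)
        mean_t, var_t = mean_and_variance(treated)
        log2_fc = mean_t - mean_c
        # Equal-variance two-sample t statistic for equal group sizes.
        t_stat = log2_fc / math.sqrt((var_c + var_t) / n_samples)
        stats.append((log2_fc, t_stat))
    return stats


def top_genes(stats, n_top, column):
    """Indices of the n_top genes with the largest |value| in the given column."""
    order = sorted(range(len(stats)), key=lambda g: abs(stats[g][column]), reverse=True)
    return set(order[:n_top])


def pog_by_method(cv, magnitude=1.5, n_samples=50, n_genes=1000,
                  n_true=100, n_top=50, seed=0):
    """POG (%) between two simulated labs for FC ranking and for P ranking."""
    if cv <= 0:
        raise ValueError("cv must be positive")
    rng = random.Random(seed)
    # Hypothetical effect model: true DEGs have shifts scattered around `magnitude`;
    # both labs share the same true shifts and differ only in their noise.
    true_shift = [rng.gauss(magnitude, magnitude / 2.0) if g < n_true else 0.0
                  for g in range(n_genes)]
    lab_a = simulate_experiment(true_shift, cv, n_samples, rng)
    lab_b = simulate_experiment(true_shift, cv, n_samples, rng)
    result = {}
    for name, column in (("FC", 0), ("P", 1)):
        shared = top_genes(lab_a, n_top, column) & top_genes(lab_b, n_top, column)
        result[name] = 100.0 * len(shared) / n_top
    return result


for cv in (0.02, 0.10, 0.30, 1.00):
    print(cv, pog_by_method(cv))
```

The printed values depend on the assumed effect model and on the seed, so they should be read as a qualitative illustration only and not as a reproduction of the figure.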
Figure 2 compares the six gene selection methods on three datasets, each with a different magnitude of difference between the treated and control groups (i.e., FC = 1.5, 0.6 and 0.2). As in Figure 1, the FC-based methods resulted in greater reproducibility than the P-based methods. Furthermore, POG increases with increasing differential expression magnitude for the FC-based selection methods. This trend is less clear for the P-value-based selection methods, where the results are equivocal.
Figure 2 The relationship of POG with the degree of difference in expression magnitude between the treated versus control groups. (A) Magnitude = 0.6; (B) Magnitude = 1.5; and (C) Magnitude = 0.2. The simulated datasets had CV = 30% and sample size = 50. The x-axis ...
Figure 3 compares the six gene selection methods on two datasets, one with a sample size of 50 and the other with a sample size of 5. The FC-based methods again give higher POG than the P-value-based methods, and the larger sample size results in higher POG for both selection approaches.
Figure 3 The relationship of POG with the sample size: (A) 50 samples/group and (B) 5 samples/group. The simulated datasets had CV = 30% and magnitude = 1.5 (see Table 1). The x-axis represents the number of genes selected as differentially expressed, and the y-axis ...
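A curve of the kind plotted in these figures, POG against the number of genes selected, can be computed from two ranked gene lists. The following is a minimal sketch under stated assumptions: the function name `pog_curve` and the two toy rankings are hypothetical, and POG at list size k is taken as the number of genes shared by the two top-k lists divided by k.

```python
def pog_curve(ranking_a, ranking_b, list_sizes):
    """Return (k, POG %) pairs comparing the top-k genes of two rankings."""
    shortest = min(len(ranking_a), len(ranking_b))
    curve = []
    for k in list_sizes:
        if k <= 0 or k > shortest:
            raise ValueError("list size must be between 1 and the ranking length")
        shared = set(ranking_a[:k]) & set(ranking_b[:k])
        curve.append((k, 100.0 * len(shared) / k))
    return curve


# Invented rankings of six genes from two simulated labs, best gene first.
ranking_lab_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranking_lab_b = ["g2", "g1", "g5", "g3", "g6", "g4"]

for k, value in pog_curve(ranking_lab_a, ranking_lab_b, [1, 2, 4, 6]):
    print(k, value)
```

When both rankings cover the same genes, POG necessarily reaches 100% once every gene is selected, so the informative part of such a curve is the region of short lists.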
Although POG is affected by the noise level, expression magnitude and sample size of the datasets, the above results clearly demonstrate that DEGs become more reproducible, especially when fewer genes are selected, if FC is used as the ranking criterion for DEG identification. It is likely that the discordance among microarray results reported in the literature is in large part due to the widespread use of P-based rather than FC-based approaches to rank genes. Another related study of ours demonstrated that the tradeoff between reproducibility and specificity/sensitivity in the FC (P) approach can be balanced by weighting FC as the primary consideration in gene ranking: an FC criterion explicitly incorporates the measured quantity to ensure reproducibility, whereas a P criterion incorporates control of sensitivity and specificity [14