We chose to test the new optimization algorithms on sets of simulated data. In this way, no explicit modeling errors are included in the computations (however, two types of noise are
included), and also we know a priori approximately where the optimal position xQTL
is located. We have simulated a collection of 115 large data sets, imitating both backcross and intercross populations. The number of QTL, d
, is varied from 2 to 6. In the intercross sets we only introduce marginal QTL effects, ie, effects depending only on the genotype at a single locus. In the backcross sets the major effects come exclusively from pairwise epistatic interactions, ie, they depend on the combined genotypes at pairs of loci. More details about the data are given in the Appendix
We begin by presenting a numerical investigation of the properties of the objective function f (x) in the search space hyper boxes where we perform local optimization. A simple midpoint test of a necessary but not sufficient condition for convexity of f (x) was implemented in the line search algorithm within the D-SD and D-QN methods. If the condition was violated in any iteration for any line search in a hyper-box, that box was marked as nonconvex. The results of this investigation was that, of the boxes containing the global optima for the 115 test problems, in total 65 proved to be nonconvex. Further experiments indicated that in these cases, the function was concave and the minimum was located at the hyper-box boundary. The corresponding result for all boxes where local optimization was applied was that at least 43627 out of the 64739 boxes tested were nonconvex. From these results it would be tempting to draw the conclusion that the gradientbased optimization methods can not be used for the local optimization phase. However, we also performed an investigation of the accuracy of the different schemes. Here, the global optimum was considered to be found if R < 1, where R is the ratio of the current error to the accepted error, ie,
In the experiments we used γ = 2 · 10−4, a choice which is motivated by that the function value at the second smallest local minimum in some cases differ to the global minimum by almost as little as this. The slightly surprising result of the investigation was that all optimization methods, including the D-SD and D-QN schemes, succeeded in finding the global minimum for all the 115 data sets. We cannot give a rigorous explanation for the good result for the gradientbased methods.
Before proceeding to a comparison of the work required for the different algorithms, we consider the criterion used for terminating the optimization algorithms again. As remarked in A two-phase optimization algorithm for the outer problem, we terminate the search for the global optimum when Nf objective function evaluations have been performed without any further improvement in the function value. For a d QTL model, the size of the search space is Gd/d!, where G is the length of the genome in centi-Morgan. This motivates us to set Nf = ( palg·G)d/d!, where the parameters palg are determined by performing a large number of numerical experiments for each algorithm, adjusting palg so that the global optimum is found in all 115 data sets. In , the values of palg·G are shown. When performing the experiments resulting in , we noted that for all algorithms the values of palg were determined by a few “exceptional” data sets. For most sets of data, a much smaller value of palg can be used and the global optimum is still found. We also noted that, in general, the intercross data sets require more function evaluations than the backcross sets.
For QTL analysis problems, the evaluation of the objective function, ie, the solution of the inner problem, completely dominates the CPU time. In and , we compare the maximal number of objective function evaluations required for the different algorithms when solving all of the test problems. The tables show results for different values of d, and also include data for an exhaustive grid search with the resolution needed to locate the optima with the same accuracy as used for the other methods. In and , the same results are shown graphically using a logarithmic scale for the number of function evaluations.
The maximal number of function evaluations for different values of d, all back-cross data sets
The maximal number of function evaluations for different values of d, all intercross data sets
The maximal number of function evaluations for different values of d, all backcross data sets.
The maximal number of function evaluations for different values of d, all D-QN 0.49 0.54 0.41 0.48 0.52 intercross data sets.
From the tables and figures, it is clear that using a twophase algorithm significantly reduces the number of function evaluations required, even when the DIRECT algorithm is used also for the local optimization. It is also clear that if the gradient based methods are employed, this gives a considerable further improvement. Here, the difference between the D-SD and D-QN schemes is not very large. It should also be noted that, as a result of the type of stopping criterion used, the number of function evaluations is dominated by the number of evaluations with no improvement required before terminating the global optimization algorithm. This also results in that there is no significant difference between the performance of the algorithms for the backcross and intercross data sets, even though for most of the backcross problems the global optimum is actually found faster than for most of the intercross problems.
In and , we show the fraction of the total number of objective function evaluations spent in the local phase for the three methods using a local algorithm. From the tables, it is clear that even though the difference in work between using DIRECT for the local optimization compared to using a gradient-based scheme is not dramatic, significantly less time is spent in the local optimization phases when the gradientbased methods are used. Also, for these schemes, the fraction of work in the local phase is less dependent on d.
The fraction of function evaluations in local algorithm, backcross data
Fraction of function evaluations in local algorithm, intercross data
Finally, we study the ability of the forward selection technique mentioned in the Introduction and A class of QTL models to locate the global optima for our test problems. We applied the forward selection procedure to our data sets, using exhaustive grid search for the consecutive one-dimensional optimization problems. In , we show the ratio of maximal actual error, in the minimum found using forward selection, to the accepted error. Recalling that a successful search is defined by R
≤ 1, it is clear from the table that only for the model with d
= 2 and the intercross data sets, the correct optimum is always found. For the backcross data, the method failed already for a model with two QTL. In many cases, the wrong cc-box is identified and the error is 50 times larger than acceptable. When the right cc-box is identified, the error in the position is often still very large. These results are consistent with previous observations for experimental data that forward selection can fail to detect QTL whose main effect is epistatic.9
It is important to note that also for the intercross data set, where there are no interaction effects at all, forward selection apparently can fail to find the correct cc-box. This indicates that it is important to use a true d
-dimensional optimization method as soon as multiple QTL are fitted for a single phenotype, even when no interactions are included in the model. However, it is clear that this type of simple experiment needs to be extended to real data sets and actual QTL analysis problems before any firm conclusions of this type can be drawn.
Results from forward selection, ratio of actual error to accepted error