|Home | About | Journals | Submit | Contact Us | Français|
We consider the problem of what is being optimized in human actions with respect to various aspects of human movements and different motor tasks. From the mathematical point of view this problem consists of finding an unknown objective function given the values at which it reaches its minimum. This problem is called the inverse optimization problem. Until now the main approach to this problems has been the cut-and-try method, which consists of introducing an objective function and checking how it reflects the experimental data. Using this approach, different objective functions have been proposed for the same motor action. In the current paper we focus on inverse optimization problems with additive objective functions and linear constraints. Such problems are typical in human movement science. The problem of muscle (or finger) force sharing is an example. For such problems we obtain sufficient conditions for uniqueness and propose a method for determining the objective functions. To illustrate our method we analyze the problem of force sharing among the fingers in a grasping task. We estimate the objective function from the experimental data and show that it can predict the force-sharing pattern for a vast range of external forces and torques applied to the grasped object. The resulting objective function is quadratic with essentially non-zero linear terms.
The human motor system is redundant with respect to common actions it performs. This means that there usually are numerous ways to achieve a particular motor goal (Bernstein 1967). At the same time, human motor actions usually are well-reproducible. They vary only slightly from trial to trial and from subject to subject. Such consistency may reflect the fact that humans try to perform their actions in most “comfortable” ways, optimizing performance in some sense.
The problem of what is being optimized in human actions was studied for the trajectory formation in reaching movement (Biess et al. 2007; Cruse et al. 1990; Engelbrecht 2001; Flash and Hogan 1985; Plamondon et al. 1993; Tsirakos et al. 1997), writing (Edelman and Flash 1987), walking (Pham et al. 2007); force sharing among the muscles (reviewed in Prilutsky 2000; Prilutsky and Zatsiorsky 2002) in gait (reviewed in Collins 1995), cycling (Prilutsky and Gregory 2000), jumping (Anderson and Pandy 1999), sit-to-stand movement (Kuzelicki et al. 2005), postural control (Kuo and Zajac 1993), and force sharing among fingers in grasping (Pataky et al. 2004).
This problem is usually called the problem of inverse optimization (Ahuja and Orlin 2001), in contrast to the more common direct optimization problem. The latter consists of finding values minimizing a given objective function. Unlike direct optimization problems, in the inverse optimization the objective function is unknown, while the values at which the objective function reaches its minimum are given. The inverse optimization problem is usually considered for a set of different constraints.
Though the inverse optimization problem has been addressed for a vast range of tasks, the approach rarely goes beyond the cut-and-try method. The common way to analyze it is based on an a priori chosen objective function, for which predictions are obtained and compared with experimental data. The choice of the objective function depends on physiological or psychological considerations and often on mathematical elegance. A reasonably good agreement of the prediction with experimental data is usually interpreted as a proof that the chosen objective function is the one that is optimized by the motor system.
The most extensive analysis of the problem has been done for the force sharing among the muscles serving one or several joints (reviewed in Erdemir et al. 2007; Prilutsky and Zatsiorsky 2002). In this case the optimization problem is to find muscle forces Fi such that, taken together, they exert desired moments of force Mj at the joints and minimize some objective function J. The muscles can exert only pulling efforts not exceeding the maximum values . In mathematical terms this problem can be stated as follows:
where k is the number of joints and n is the number of muscles, acting on these joints, n > k, rij is the arm of the ith muscle relatively the j-th joint.
The objective function J is usually of the form:
where are positive weights, which are usually either taken equal to each other or are normalized to some quantity (for example, the physiological cross section area or maximum force). The power p can be any positive number up to plus infinity, in which case the objective function transforms into:
Crowninshield and Brand (1981) suggested an approach for defining the parameters of the objective function (1), which they claimed to be physiologically relevant. In their model the value being minimized is inverse of the fatigue time averaged across muscles. They show that this value is proportional to normalized muscle force raised to the power p between 2.54 and 3.14. Normalization parameters are taken proportional to the physiological cross section area (PCSA).
The above function captures general features of muscle activation patterns in gait (Crowninshield and Brand 1981). It has been even suggested that in skilled movement the muscle forces are shared in a way to minimize this objective function (Prilutsky 2000). However, this hypothesis cannot explain experimentally observed co-activation of antagonist muscles (i.e. muscles making opposite contributions into the moment of force at the same joint) except for biarticular antagonists (Ait-Haddou et al. 2000; Herzog and Binding 1992).
It is unclear whether an optimization function can be unambiguously identified from experimental data. Collins (1995) has shown that the same experimental data can be explained reasonably well with significantly different objective functions. To some extent, this fact may be due to the imprecision of muscle force estimation from the electromyographic signal and uncertainties in the moment arms, which have been shown to influence the result of optimization significantly (Herzog 1992; Raikova and Prilutsky 2001; Redl et al. 2007).
These uncertainties are less dramatic in grasping tasks, where forces and points of their application can be directly measured (Zatsiorsky et al. 2002; Zatsiorsky and Latash 2008). When the grasp orientation was vertical and the subjects had to maintain the hand-held objects in the air at equilibrium resisting object weight and external torque an optimization was performed using as criteria the cubic norms of (a) finger forces, (b) finger forces normalized with respect to the maximal forces measured in single-finger tasks, (c) finger forces normalized with respect to the maximal forces measured in a four-finger task, and (d) finger forces normalized with respect to the maximal moments that can be generated by the fingers. All four criteria failed to predict antagonist finger moments (moments exerted by individual fingers that assisted rather than resisted external torques) at large external torques. Note that the above criteria did not take into consideration the finger interdependence (‘enslaving’). To account for the finger enslaving the vectors of “neural commands” were reconstructed from the finger forces using the enslaving matrix (Zatsiorsky et al. 2002). Optimization of the neural commands resulted in the best correspondence between actual and predicted finger forces; in particular the antagonist moments were predicted. However, when the grasp orientation was not vertical all the above mentioned objective functions explained the experimental data with approximately similar accuracy (Pataky et al. 2004).
In this paper we show that, even if ideally precise experimental data are given, the inverse optimization problem may have infinitely many solutions. Consider a simple mental experiment. Assume that the subject is instructed to exert a given total force with the four digits:
In the experiment the total force is varied within some range. Assume that the subject performs the task perfectly and there are no errors in data recording. For each value of the total force the subject chooses a pattern of sharing the total force among the digits. Let us assume that the total force is shared equally among the fingers:
Now, assume that a researcher guesses an objective function whose optimization should lead to the observed experimental results:
Indeed, one can verify that minimization of (4) subject to the constraints (2) has a unique solution (3). However one can also notice that (4) is by far not the only function with such properties. For example, the objective function
Hence, for this particular example, there exist infinitely many different objective functions, which optimization, subject to (2), results in (3). At the same time, having more observations under other experimental conditions might limit the range of possible objective functions.
The main goal of this paper is to develop a method to determine an unknown objective function from a set of observations. To do that we obtain sufficient conditions that guarantee unique solution of the inverse optimization problem. We focus our analysis on the inverse optimization problem with additive objective function and linear constraints:
where x is an n-dimensional vector, gi unknown scalar functions, C a k × n-matrix and b a k-dimensional vector, k < n.
This formalization is typical of a vast range of problems considered in the human movement science, in particular for various forms of the force sharing problem. A simpler version of this problem was analysed in Siemienski (2006) for the case when the functions gi are proportional to each other and only one linear constraint is present.
The paper has the following structure. In the Preliminaries, we provide general definitions and statements we needed for the analysis of inverse optimization problems. In the Main Results, we focus on the problem (7), for which we prove the Uniqueness theorem. Then we consider two simple examples of the inverse optimization problem and illustrate how the Uniqueness theorem can be used to solve them. In the Applications, we analyse a “real-life” example of force sharing in grasping. We used our theoretical results to plan an experiment and to determine the objective function from the experimental data. The results are discussed in the Discussion section, followed by Appendix, which contains the proofs of the statements given in the paper.
Estimating the objective function from observations does not necessary lead to a unique solution. Indeed, there are some transformations of the objective function that do not influence the solution of the optimization problem. Among those are multiplication of the objective function by a positive number or adding an arbitrary number to the objective function. As a consequence, the inverse optimization problem can never be solved uniquely unless some additional information on the objective function is given. We call two objective functions J1 : X → and J2 : X → essentially different on a subset X n if there exist constraints such that the problems J1, and J2, have different solutions. Otherwise we call the objective functions essentially similar on X.
Optimization of essentially similar objective functions under any constraints leads to the same result. Therefore the inverse optimization problem can be solved up to the class of essentially similar functions only. It must be noted that the class of essentially similar objective functions is rather vast. For example, the objective functions J (x) and f (J (x)) are essentially similar for any strictly increasing function f.
In some cases, minimization of the objective function can be performed independently for some subsets of variables. This fact may limit the possibility of inverse optimization. Consider a simple example:
It is evident, that x1, x2, x3, x4 guarantee minimization of J → min subject to constraints if and only if x1, x2 minimize subject to x1 + x2 = a and x3, x4 minimize subject to x3 + x4 = b. In these two problems the functions J1 and J2 can be replaced with any essentially similar objective functions 1 = 1(x1, x2) and 2 = 2(x3, x4). Then all x1, …, x4 that minimize subject to constraints x1 + x2 = a and x3 + x4 = b will also minimize the objective function = 1 + 2 subject to the same constraints. However, the objective functions J and are essentially different. Thus, there are infinitely many essentially different objective functions, which are minimized by the same values x1, …, x4 under constraints x1 + x2 = a, x3 + x4 = b. This fact may lead to non-uniqueness in solving the inverse optimization problem.
We now define a splittable optimization problem. To do that we first introduce the notion of groups of variables independent with respect to the optimization problem. Let the objective function J (x) be minimized subject to the constraints (x), where x = (x1, …, xn)T. Assume that the elements of x are split into two groups x1 and x2. The initial optimization problem converts into the following: minimize (x1, x2) = J(x) subject to (x1, x2) = (x). Now, consider the optimization problems: subject to (x1, 2); and similarly for .
The variables x1 and x2 are said to be independent for the optimization problem J, if the solution of the problem does not depend on 2 and similarly for the problem . In particular, if there is just one point 2 satisfying 2, the variables x1 and x2 are independent.
We call the optimization problem J, splittable if it has independent variables. Consider the following example:
where x1 and x2 are composed of components of x with indexes 1 and 2 respectively, C is a k × n-matrix (k < n), rank C = k, b is a k-dimensional vector.
The variables x1 and x2 are independent for the regarded optimization problem if and only if there is a matrix D, det D ≠ 0, such that in every row of the matrix DC all elements, either with indices 1 or 2, equal zero. The proof of this statement is given in Appendix.
We call an objective function additive if it is essentially similar to an objective function that can be written as follows
Assume that the additive objective function (9) is minimized subject to linear constraints:
where C is a k × n matrix, k < n, rank C = k, b is a k-dimensional vector. Then the problem is splittable if and only if there is a k × k-matrix D, det D ≠ 0, such that by reordering the rows one can make the matrix DC block-diagonal.
Thus, if an additive objective functions is minimized under linear constraints, then the question of whether the corresponding optimization problem is splittable depends only on the properties of the constraint matrix C. For this reason we call a full-ranked matrix C splittable if it satisfies the above mentioned conditions.
The matrix C is splittable if and only if one can make the matrix = CT (CCT)−1 C block-diagonal by reordering the rows and columns with the same indexes (here and on CT denotes the transpose of the matrix C). Indeed, if C is block-diagonal then so is . On the other hand, it is easy to show that multiplying the matrix C by the matrix results in a block-diagonal matrix. Since the matrix has rank k, it has k linearly independent rows. It can be shown that, if the matrix D is composed of these rows, then the matrix DC has block-diagonal structure. Therefore, if the matrix is block diagonal, then so is C.
The main goal of the current study is to find sufficient conditions for the uniqueness of solutions of an inverse optimization problem with additive objective function and linear constraints:
where C is a k × n matrix, rank C = k, and b is a k-dimensional vector.
The formulas (10), (11) define a class of direct optimization problems parametrized with b B, where B is a domain of k. Here and on we assume that every direct optimization problem has a unique solution and that the solutions are known for all b B. The set of the solutions will be denoted by X*.
Given the set X*, the inverse optimization problem consists of finding functions gi, i = 1, …, n defined on some subset of X.
The optimization problem (10), (11) imposes some strong requirements on the functions gi that come from the Lagrange minimum principle, which must hold for every x X* and the corresponding parameter b. The requirements are summarized in the following lemma, which can be thought of as the Lagrange principle for the inverse optimization problem.
Lemma 1 If the functions gi (·) in (10) are continuously differentiable then they satisfy the equation:
and (prime symbol denotes derivative) and I is the n × n unit matrix.
Theorem 1 Assume that the inverse optimization problem (10), (11) with k ≥ 2 is non-splittable. If the functions gi (·) in (10) are twice continuously differentiable and there exist twice continuously differentiable functions fi such that is not identically constant and Č f ′(x) = 0 for all x X*, and Č is defined by (13), then
where the constants qi satisfy the equation Čq = 0, q = (q1, …, qn)T, consti are arbitrary scalar numbers and r is a non-zero constant value.
This theorem provides sufficient conditions for existence and uniqueness, up to linear terms qi xi, of solutions of the inverse optimization problem. This means that if one could find such fi that minimization of the objective function
subject to the constraints (11) for all b B results in the set X*, then the desired objective function J is essentially similar to up to unknown linear terms qi xi.
The values qi satisfy
where is the ith column of the matrix CT, p = (p1, …, pk)T and pi are arbitrary numbers.
Thus, given the constraints (11) the expression qT x does not depend on x. Therefore, adding these linear terms to the objective function does not influence the solution of the optimization problem as long as the constraints (11) are present.
An important manifestation of Theorem 1 is that it provides unique solution of the inverse optimization problem. We first consider the case when k = n − 1. We call such inverse optimization problem elementary. In this case the rank of matrix Č is equal to one and, consequently, the vector equation Č f′ (x) = 0 is equivalent to the following scalar equation:
where a = (a1, …, an) is any row of the matrix Č. Notice that since the problem is non-splittable, the coefficients ai are nonzero (see the proof of Theorem 1).
At the same time, X* is an (n − 1)-dimensional smooth hypersurface in the n-dimensional space, which can be defined by a single scalar equation. Now the problem of the inverse optimization consists of finding a collection of functions h1(x1), …, hn(xn) such that the equation
defines the hypersurface X*. Indeed, if such functions are known then
Here r = 1 and the sign of r determines whether the objective function J reaches minimum or maximum on the given set X*.
Now consider the case of an arbitrary k. Let x1 and x2 be two groups of variables such that x1 k+1. An optimal solution x* of the problem (10), (11) corresponds to some values of these variables: x1* and x2*. Consider the following optimization problem, obtained from the initial one by adding the constraint x2 = x2*:
Here 1 and 2 are sets of indices of x, corresponding to x1 and x2; the matrices C1 and C2 are composed of the columns of C with indices 1 and 2 respectively.
Evidently, if x1 is a solution of (15) then x1 = x1*. One can define the set X1* of all x1*, which is the projection of X* onto axes 1.
The problem (15), (16) is defined for a (k + 1)-dimensional variable x1 subjected to k constraints, i.e. it is an elementary inverse optimization problem. If it is non-splittable, then the procedure described above is applicable and allows one to estimate functions gi (xi) for i 1. Since the choice of indices 1 was to some extent arbitrary, all functions gi can be determined in this way.
If the inverse optimization problem is splittable, it should be split until we reach a non-splittable subproblem. We thus proved that, if the initial problem (10), (11) is non-splittable, then for every i there exists a non-splittable elementary subproblem with at least two constraints.
It must be noted that to apply the proposed method the researcher must assume that the experimentally observed hypersurface is composed of solutions of the optimization problem with an additive objective function and known linear constraints. However, this assumption may not actually hold as for a given set of constraints there may be no additive objective function, which minimization results in the given hypersurface of solutions. In this case the method would result in a non-feasible objective function, which, for example, does not reach either its minimum or maximum on the observed surface, but instead has only hyperbolic points on it.
We illustrate our method by four simple inverse optimization problems.
Consider an inverse optimization problem with an additive objective function of three variables:
subject to the constraints:
The solution of the direct optimization problem is known for a range of b1 and b2:
The constraints (18) are given by the matrix C:
Note that the matrix C is non-splittable. Indeed, the matrix Č = I − CT (CCT)−1 C (see (13)):
cannot be made block-diagonal by reordering the rows and columns with the same indices.
Now we wish to find twice continuously differentiable functions f1, f2, f3 satisfying the equation:
for every x1, x2, x3 from (19). Since the matrix has rank one, the latter is equivalent to the following scalar equation:
Solution (19) determines a plane in the 3-dimensional space:
Then, according to Theorem 1, the functions gi are equal to:
where r is an arbitrary non-zero scalar. The values qi can be represented as
Since determining the objective function can only be performed up to the class of essentially similar functions, the constants in (23) can be set zero. The parameter r can be taken equal ± 1. Since we want the resulting objective function to be minimized rather then maximized on (21), the parameter r must be positive.
The desired objective function is essentially similar to:
where p1 and p2 are arbitrary scalar numbers.
Consider an inverse optimization problem with an additive objective function of four variables:
subject to the constraints:
Let the solutions of the direct optimization problems for a range of b1, b2, b3 lie within a 3-dimensional subspace with the normal vector u = (1, −2, 1, 0)T.
The first step is to verify that the problem is non-splittable. The matrix C for the constraints (26) is
The matrix Č = I − CT (CCT)−1 C is
and is block-diagonal, hence the matrix C is splittable.
The problem splits into two subproblems. The first one is:
Its solution for a range of b1, b2, b3 lies in a plane in the 3-dimensional space, which can be obtained as the result of projecting the solution of the initial problem on the space of variables x1, x2, x3. This plane passes through zero and is orthogonal to the vector ũ = (1, −2, 1)T.
The second problem is vacuous, since x4 can be unambiguously determined from the constraints:
and, thus, g4 can be an arbitrary function.
In this case, the best one can do is to estimate the functions g1, g2, g3. The constraints in (28) are defined by the matrix :
The plane of the solutions of the direct problem (28) can be defined by the equation:
In the previous examples the estimated objective function is quadratic reflecting the fact that the surface of solutions is planar. Here we illustrate how to use our method to analyze non-polynomial objective functions. Consider the inverse optimization problem from Example 1 with additive objective function (17) and linear constraints (18). Assume that the variation of parameters b1 and b2 results in a surface, defined by the equation:
Then according to Theorem 1
where r is a non-zero scalar whose sign determines weather the optimization problem consists in minimization or maximization of the objective function, qi are arbitrary scalars satisfying the equation q1 − 2q2 + q3 = 0.
The objective function J is essentially similar to the following:
where p1 and p2 are arbitrary scalar numbers.
In the previous example the method resulted in estimation of the feasible objective functions. Here we give an example of a hypersurface not resulting in minimization of any additive objective function. Consider the inverse optimization problem from Example 1 with additive objective function (17) and linear constraints (18). Assume that the variation of parameters b1 and b2 results in a surface, defined by the equation:
Then according to Theorem 1
where r is a non-zero scalar, qi are arbitrary scalars satisfying the equation q1 − 2q2 + q3 = 0.
The estimated objective function J is the following:
where p1 and p2 are arbitrary scalar numbers.
Evidently, that for any non-zero r, the estimated function does not reach it’s minimum subject to constraints (18) at any point of 3. Thus, it proves falsity of the hypothesis that the hypersurface (32) results from minimization of an additive objective function subject to constraints (18).
To illustrate applicability of the approach presented above to “real-life” tasks we analyze the problem of force sharing among the digits in prismatic grasping when a subject holds a handle similarly to holding a glass with liquid. The points of application of the thumb and finger forces are assumed to lie in the grasp plane, which is parallel to the longitudinal handle axis (see Fig. 1). An external force Fl parallel to the handle axis and an external torque T orthogonal to the task plane act on the handle.
In the planar case, the static equilibrium constraints include two equations on the forces and one equation on the moments of force. For a vertically oriented handle, the load force Fl must be counterbalanced by the tangential forces of the fingers and the thumb :
The normal force of the thumb must be equal and opposite to the total normal force of the fingers:
The joint moment of the normal and tangential forces must be equal and opposite to the external torque T.
The normal forces must be non-negative and cannot exceed their maximum values:
The tangential forces must stay below the maximum static friction force:
where µi is the coefficient of the static Coulomb’s friction.
The static equilibrium imposes three equality-type constraints (33), (34), (35) on ten force variables: five normal and five tangential forces. Even though they must also satisfy fifteen inequalities (36), (37), in general, the problem is redundant.
In spite of the redundancy the force sharing among the fingers is quite reproducible among trials with fixed load force F and external torque T (Shim et al. 2003; Zatsiorsky et al. 2003). This is especially true for the normal forces. For zero external torque the normal forces are known to scale with the load force (Niu et al. 2007; Westling and Johansson 1984). It is reasonable to assume that a particular force sharing patterns result from minimization of a certain objective function of the normal and tangential forces.
We assume that the force distribution among fingers in grasping results from the minimization of the objective function J for given handle geometry and friction coefficients:
where gi are scalar functions of normal forces and H is a scalar function of tangential forces.
Our goal here is to estimate the objective function involved into the sharing of the normal forces among the fingers, in particular, the functions gi in (38). Indeed, the fact that the constraints are linear and the assumption that the objective function is additive with respect to the normal forces, make it possible to find the functions gi without resorting to the function H. To this end, we consider the following reduced optimization problem:
where Mt stands for the total moment of the tangential forces. The Eq. (33) was omitted because it does not contain normal forces.
According to Theorem 1 one can find the objective function unambiguously, up to some linear terms if a k-dimensional surface of solutions is known, where k is the number of the (equality) constraints. In our case k = 2 and therefore a 2-dimensional surface of the optimal solutions is required.
Changes in the external torque effect only the second equation in (40), while the first one remains unchanged. The load force is not directly present in the constraints (40). Variation of the external torque at a constant load will result in a curve instead of a surface required to apply the method.
To overcome this difficulty we introduce an additional constraint by asking the subject to grip the handle with a given total grip force Fg:
The constraint (41) makes the problem splittable since from (40) and (41) it follows that . Splitting the initial problem produces two subproblems. The first one contains only normal force of the thumb and is vacuous. The second one includes the normal forces of the fingers:
The normal forces of the fingers must also satisfy the (inequality) constraints (36) and (37). However, as it is known from various experiments (Johansson and Westling 1984; Cole and Johansson 1993), normal forces are typically 30–50% above the slipping threshold. If the external load is not very large, neither normal nor tangential forces approach the borders of the domain defined by (36), (37) and therefore these constraints can be omitted. The constraint matrix C for the problem (42), (43) has the form:
The problem (42), (43) has two linear constraints containing two parameters: the external torque T and the grip force Fg, which can be varied independently in the experiment. Thus, we hope that the finger forces F1, F2, F3, F4 obtained for the pairs (T, Fg) will form a surface in the 4-dimensional space of finger forces. As it was shown above, knowing this surface should be sufficient for estimating the additive objective function (42).
We present results obtained for three subjects. They were right-handed young male adults with no history of hand injury (age 27.6±3.0 year, weight 74.7 ± 9.0 kg, height 176.3 ± 9.2 cm, hand length from the middle fingertip to the distal crease of the wrist with hand extended 18.4 ± 0.9 cm, hand width at the MCP level with hand extended 8.9 ± 0.7 cm).
In the experiment, the subjects held a handle mounted with five 6-dimensional force-torque sensors, whose surfaces were covered with sandpaper. The geometry of the handle is presented on Fig. 1. The top of the handle was equipped with an air-bubble level intended to help subjects to hold the handle vertically. A horizontal bar was attached to the handle at the bottom. Suspending various loads at different points along the bar allowed for varying both the load force Fl and external torque T in (43). All combinations of four values of the load force 12.5, 15, 17.5 or 2.0 N, and five values of the external torque −0.4, −0.2, 0.0, 0.2, 0.4 Nm were used in the current study. The total number of combination was 20.
For every combination there were two types of trials, calibration and experimental, lasting 10 s each. In the calibration trials the subjects were instructed to hold the handle naturally trying not to grip it too hard. The total grip force was averaged over the 10-s period of the calibration trial. The averaged value was then used in the experimental trials, in which the subjects were instructed to make the gripping force equal 100, 125, 150 or 175% of that value. The subjects could see the current value of the total normal force of the five digits (gripping force) and the target value on the computer monitor located in front of the subject. Though the load force is not directly present in the Eq. (43), the subjects tended to grip harder in the calibration trials with the greater load force. Thus, by changing the load force we increased the range of the grip force.
On the whole, every subject performed 80 experimental trials. In every trial the average normal finger forces were computed over a 2-second interval where they exhibited the least variation. Following this procedure we obtained 80 points (one point per trial) in the 4-dimensional space of the normal finger forces for every subject. The thumb data are not presented since they are not relevant to the problem (42), (43).
It should be noted that the points of application of the normal finger forces varied across trials and conditions, which in turn resulted in variation of the constraints matrix C. Contrary to the case described above, where C was a constant matrix, we assume that the variations of the matrix C result in small changes of the optimal finger forces as compared to those caused by changing the external torque and the grip force. This assumption allows us to apply our method while keeping in mind that the experimental data points can be scattered around the ideal optimal surface. We used the following values of geometrical parameters of the handle:
The influence of the varied factors (the external torque, the load force and the grip force) on the normal finger forces is illustrated on Fig. 2. One can see that the influence can be rather complex (especially Fig. 2b). However, as mentioned earlier, we expect the experimental values of the normal finger forces to lie on a surface in the 4-dimensional space. To verify this fact, for every subject we plotted 3-dimensional projections of the experimental data on the subspaces (F1, F2, F3), (F2, F3, F4), (F1, F3, F4) and (F1, F2, F4). In the projections the points tended to lie on planes, but were dispersed around them. The planarity of the data in the 4-dimensional space of the finger forces was quantified using the principal component analysis. It showed that 94.4 ± 0.4% of the total variance could be explained by 2 principal components, which define a plane in the 4-dimensional space.
Based on this observation, we made an assumption that the surface of the optimal solutions is a 2-dimensional plane, which can be approximately estimated from the experimental data. The fact that the data points did not ideally form a plane can be explained by variation of the points of the finger forces application, variability of the subject performance and instrumental noise. The plane can be defined by the vector equation:
where is the vector of the normal finger forces, A is a full-ranked 2 × 4-matrix and b is a 2-dimensional vector. The matrix A is composed of transposed vectors of two lesser principle components. The vector b is defined as b = −An, where n is the vector of average finger forces.
According to Theorem 1, if there are functions satisfying Č f′ (Fn) = 0 on the plane (45), then they coincide with gi up to linear terms. Since the experimental points tend to form a plane the following functions can be chosen:
Now, the inverse optimization problem consists of finding coefficients k1, k2, k3, k4 and values w1, w2, w3, w4 for which the plane Č (K Fn + w) = 0 (where K = diag (k1, k2, k3, k4), w = (w1, w2, w3, w4)T) coincides with the plane AFn + b = 0. However, such ki do not always exist. Since both the experimental plane (45) and the matrix C are known imprecisely, it might happen that for a given experimental plane these coefficients cannot be determined, while it can be done for a plane lying very close to the experimental one.
For this reason we searched for the values ki that minimize angle α between the planes Č K Fn = 0 and AFn = 0, assuming k1 = 1 for normalization. This problem was solved numerically. The minimum α equals 2.7±0.5°, confirming that the desired plane is indeed close to the experimental one. Moreover, the plane Č K Fn = 0 gives just 0.2±0.1% less variability of the centered experimental data than the experimental plane (45).
The vector w was chosen to have minimal length. It can be easily shown that such a vector is defined by the formula:
Thus, we found functions fi satisfying the equation Č f′ (Fn) = 0 for all Fn lying on the plane Č (K Fn + w) = 0, which approximates the experimentally observed optimal finger forces. According to Theorem 1, the functions gi in (42) are the following:
Here r is a nonzero number and qi are any numbers satisfying the equation Čq = 0, q = (q1, q2, q3, q4)T.
Since the objective function can be estimated only up to the class of essentially similar functions, one can assume consti = 0 and r = 1. The sign of r is chosen in such a way that the resulting objective function corresponds to minimization problem.
The vector q can be represented as follows:
where p1, p2 are arbitrary numbers, c1 and c2 are columns of the transposed constraints matrix CT. Thus,
The values ki and wi for all subject are given in Table 1. We would like to emphasize that all coefficients ki are positive for all subjects, meaning that the estimated objective function is feasible (it really reaches its minimum on the idealized experimental surface).
The theorem of uniqueness requires the hypersurface of solutions to be known. In this example we dispose only limited set of data points, which, in addition, are subjected to noise. We idealize this data by assuming that it tends to lay on a hyperplane. However, it may happen that in ideal experiment the ideal hypersurface would be slightly different from the hyperplane and consequently, the real objective function would differ from the estimated one. Thus, the estimated objective function represents a quadratic approximation of real objective function. In general, this approximation is as good as the approximation of the experimental data with the hyperplane. To illustrate the quality of the estimated objective function we solved the direct optimization problem with the objective function (49) and the constraints (43). The values in the right side of the constraints equations (43) were computed from the experimental data. Since the solution of the direct optimization problem does not depend on parameters p1 and p2, they were set to be zero. The average errors were computed for every finger as the average absolute difference between the experimentally observed value of the normal force and the one predicted by the optimization problem. The correlation between the predicted and experimental data and the average errors are presented in Fig. 3.
In this procedure we used the same set of data both to estimate the objective function and to illustrate its use in approximating the experimental results. To further validate our method and the assumptions on which it is based, e.g. that the objective function is additive and the plane is an acceptable approximation of the experimental data, we performed additional computational experiments in which estimation and validation sets of data were separated. For each subject we selected 60 random data points and used them to estimate the objective function. Then we solved the forward optimization problem with this objective function and compared the corresponding solutions with the remaining 20 data points. We performed this procedure 50 times for each subject and computed the average errors of the results of the forward optimization. We found that the average errors were only slightly larger (<20%) than those shown in Fig. 3 and did not exceed 0.6N; the coefficients ki were always positive.
In this paper we analyse the problem of inverse optimization, which consists of finding an unknown objective function given the values, at which the function reaches its minimum, for a set of different constraints. This problem often arises when the principles of control of human movements are studied (Engelbrecht 2001). Nowadays, the inverse optimization problem is usually approached with the cut-and-try method. As a consequence, a number of different objective functions have been proposed to explain the control of the same motor task (Collins 1995; Cruse et al. 1990; Pataky et al. 2004).
The attempts to approach the inverse optimization problems more systematically are rather rare. Some theoretical results were obtained in the problem of the correction of a known objective function in linear programming and the theory of combinatorial optimization (Ahuja and Orlin 2001). Recently Bottasso et al. (2006) proposed a systematic approach to identifying unknown parameters of the objective function taking into account possible inaccuracy of the experimental data. Siemienski (2006) proposed an approach to non-parametric identification of an unknown objective function from experimental data for a specific class of additive objective functions, which are minimized subject to a single linear constraint. Siemienski was among the first to emphasize that the inverse optimization problem should be regarded for a set of different constraints.
The purpose of the current study is twofold: to develop a method for non-parametric identification of the objective function in the inverse optimization problem and to obtain sufficient conditions on the set of constraints (and the objective functions themselves) that would guarantee the uniqueness of the solution of the inverse optimization problem. We focus our analysis on the class of the inverse optimization problems with additive objective functions and linear constraints. This class includes various aspects of the force sharing problem, which is one of the most common inverse optimization problems in the science of human motions.
We show that for any value, which minimizes the additive objective function, almost unique identification of the function can be performed. The dimension of the space of such values must be equal to the dimension of the constraints in the problem. From the practical point of view this means that, in order to solve the inverse optimization problem, one should be able to vary independently the values of every constraint. We note that this condition was not met in most studies where an objective function was proposed to optimize various motor tasks. We believe that one of the reasons for a variety of different objective functions in the case of force sharing is insufficient amount of experimental conditions that were used to determine the objective function. The conditions of the Uniqueness theorem proved here can be used to plan the experiments directed to identifying the objective function.
Moreover, we use the Uniqueness theorem to propose a method of solving the inverse optimization problem. To determine the additive objective function of n variables minimized subject to k constraints one can find any n − k independent additive functions, which equal zero on the k-dimensional hypersurface of the experimental data. Of course, in reality one has only a finite number of experimental observations and, therefore, these data should be used to determine the idealized hypersurface. The latter cannot be unambiguous since it represents an attempt to estimate non-parametrized function from limited set of noisy data. Roughly speaking, our method provides as accurate estimate of the objective function as is the idealization of the hypersurface.
In this study we restrict our analysis to the problems with equality constraints only. Usually “real-life” problems have both equality and inequality constraints. The analyzed problem of the finger force sharing in grasping can be considered as an example. Indeed, the normal forces of the fingers cannot be negative or exceed their maximal values while the magnitudes of tangential forces cannot exceed the magnitudes of the normal forces multiplied by the friction coefficient. Nevertheless, the developed method can be applied to this problem because the inequality constraints are “passive”, which means that all experimental points are inside the set defined by the inequality constraints and not on the boundary of this set. In the context of the inverse optimization problem, the passive inequality constraints can be ignored. On the contrary, if all the data points are on the boundary of the constraints, such constraints become “active” and should be treated the same way as any other equality constraints. The case when for some subset of data the inequality constraints are active and for the other they are passive cannot be approached with the proposed method.
Developing our method we assumed that the given experimental data result from the minimization of an additive objective function subject to the known linear constraints. This assumption keeps out of the scope the question of the existence of solutions of the inverse optimization problem. In practical applications the researcher can rarely know for sure that the observed experimental data indeed corresponds to a solution of an optimization problem. In this case he/she can assume the latter and apply the method keeping in mind that the estimated function must be verified to ensure that it really reaches its minima on the observed data.
We illustrate the applicability of our method by analysing a “real-life” example of the force sharing problem in grasping. We demonstrate how the conditions of the Uniqueness theorem can help planning the experiment. We found that, in order to estimate the objective function used for the normal finger forces distribution, it is necessary to vary the external torque applied to the handle and the total grip force of the fingers. We would like to note that in most previous attempts to estimate this objective function the external torque and the load force were varied instead.
We use our method to estimate the objective function from the experimental data. The resulting objective function is quadratic with nonzero linear terms. Polynomial and, particularly, quadratic objective functions have been proposed for force sharing problem, however they did not include linear terms. It must be emphasized here that the fact that the estimated objective function is quadratic follows from the planarity of the surface of experimental data. Moreover, the presence of linear members is a consequence of the fact that the experimental plane does not contain the origin of the reference frame. Since the experimental data was limited and noisy the real objective function may be different from the estimated one. Thus, the result should be treated as a quadratic approximation of the real objective function. In order to illustrate the precision of this approximation we solve the direct optimization problem and compare the solutions with the experimental data. The ability of the objective function to explain the experimental data is illustrated by Fig. 3. In addition, we estimated the objective function on randomly selected subset of data and then validated the estimated function on the remaining data. It appeared that the average performance on the new data was comparable to the one when the same data set was used both for estimation and validation. The latter confirms that at least in the regarded example, the method performs rather robustly and the estimated objective function can be used, of course, with limitations, to predict new experimental data.
We believe that the method we propose can be helpful in analysis of principles underlying the control of human movements. It can be applied to a vast range of problems, especially to various forms of the problem of force sharing. This method provides a mathematical tool for an almost unambiguous identification of the objective function from experimental data, however the question of its interpretation still remains unanswered.
The study was supported in part by NIH grants AR-048563, AG-018751, NS-35032 and NSF grant 0754911. The authors would like to thank James Metzler for the technical support and the reviewers for valuable comments that helped us substantially improve the quality of the manuscript. Our special thanks go to an anonymous journal reviewer who provided Example 4.
We present here the proofs of the theorems and lemmas formulated in the text.
Lemma 2 Consider the following optimization problem:
where the groups of variables x1 and x2 are composed of components of x with indexes 1 and 2 respectively, C is a k × n-matrix (k < n), rank C = k, b is a k-dimensional vector.
Then the groups of variables x1 and x2 are independent for the corresponding optimization problem if and only if there is a matrix D, det D ≠ 0, such that in every row of the matrix DC all elements with either indexes 1 or 2 are equal to zero.
Proof For simplicity we assume that x1 corresponds to the first m components of x and x2 to the remaining n − m components.
Suppose there is a matrix D such that:
Then the constraint (50) splits into two:
Thus, the set of all x1 satisfying (50) does not depend on x2 and vice versa.
Consider a set of objective functions parametrized by 2. All these objective functions are essentially similar and thus are minimized by the same value x1* under all possible constraints. The same holds for . Since in addition to that the constraints for x1 do not depend on 2 and vice versa the groups of variables x1 and x2 are independent.
Now, assume that x1 and x2 are independent. The objective functions are essentially similar for all 2. The same is true for . Thus, the groups of variables x1 and x2 can be independent only if the constraints on x1 and x2 are independent, i.e. the set of all x1 satisfying (50) does not depend on 2 and the same for x2.
Let C1 be the matrix comprised of the first m columns of C and C2 be the matrix comprised of the remaining n − m columns. The Eq. (50) can be rewritten as
The Eq. (51) holds for every x1 S1 ∩ X and x2 S2 ∩ X, where S1 and S2 are some affine subspaces of dimensions k1 and k2 respectively. Since X is an open domain in n and the matrix C has rank k, we have k1 + k2 = k.
Consider the linear space of all rows d2 such that d2C1 = 0. Since the matrix C1 has rank k1, the dimension of this space is k2. Hence there exist a k2 × k-matrix D2, rank D2 = k2, such that D2C1 = 0. Similarly, there exists a k1 × k-matrix D1, rank D1 = k1, such that D1C2 = 0. The matrix
has nonzero determinant since the matrix C has full rank. One can see that the matrix DC is block diagonal.
Proof of Lemma 1 Fix b in (11) and consider the corresponding direct optimization problem. Since the objective function J is differentiable it satisfies the Lagrange minimum principle. The Lagrange function is
where (·, ·) denotes the scalar product, λ = (λ1, …, λk)T is a k-dimensional vector, λ0 ≥ 0 is a number. It can be easily shown that λ0 is strictly positive and, thus, we can assume it to be one.
where Ci is the ith column of C.
In the vector form these equations can be written as
Proof of Theorem 1 I. The case of an elementary optimization problem, k = n − 1.
In this case the matrix Č has rank one and therefore the equation Č g′(x) = 0 is equivalent to the following scalar equation
where a = (a1, …, an) is any row of Č.
The coefficients ai are non-zero. Indeed, if any of them is zero then the matrix Č is block-diagonal (or can be made block-diagonal by reordering the rows and columns with the same indexes). The latter contradicts to the fact that the initial problem is non-splittable.
Let us now prove that if the functions satisfy (54) on X* then they coincide with up to a constant.
Using the Taylor decomposition in a vicinity of a point x X* we obtain that
which holds for every vector dx = (dx1, …, dxn)T in the tangent space to X* at x. Since the dimension of the tangent space is n − 1, it means that the vectors are collinear. Therefore there exists a scalar function r (x) on X* such that
We shall show that r (x) does not depend on x. To this end we express the variable x1 as a function of other variables on the hypersurface X*:
and transfer the Eq. (55) into:
where 1(x2, …, xn) = r (h2(x2, …, xn), x2, …, xn).
The Eq. (56) holds for all x2, …, xn. Since xi, i = 2, …, n can be varied while all xj (j = 2, …, n and j ≠ i) remain constant, the function 1 is constant and hence, so is the function r.
Integrating (55) twice leads to
where qi must satisfy
II. The general case, 2 ≤ k < n − 1.
As it was noted above, for arbitrary n solving the problem of inverse optimization can be reduced to solving a number of elementary subproblems. Thus, to prove the theorem in the general case it suffices to show that for every i there exists a non-splittable elementary subproblem containing gi and that the coefficient r in (57) is the same for all gi.
First we show that if C is a non-splittable k × n-matrix then for every column c of the matrix C there exists a non-splittable s × (s + 1)-minor , rank = s, s ≥ 2 resting on c.
Assume by contradiction that the statement does not hold for some column c of C, that is every s × (s + 1)-minor, s ≥ 2, of C resting on c is splittable. We prove that in this case every s × m-minor, m ≥ s + 1, s ≥ 2, of C resting on c is splittable. We shall use induction over s and m.
The proof is evident for s = 2 and arbitrary m. We prove that if this holds for all s′ × m′-minors, where s′ ≤ s and m′ ≥ s′ + 1, and it holds for (s + 1) × m-minors then it also holds for (s + 1) × (m + 1)-minors. Since the property holds for every s × (s + 1)-minor the latter would prove it for all s × m-minors with m ≥ s + 1.
Let B be an arbitrary s × (m+1)-minor of C resting on the column c. It is splittable according to the induction assumption and thus there exists a matrix D such that:
Here D11 is a s1 × s1-matrix, and respectively, D22 is s2 × s2, B11 is s1 × m1, B22 is s2 × m2, b23 is s2 × 1.
Let B′ be a (s + 1) × (m + 1)-minor such that its first s rows coincide with those of the minor B. Then
Here b31 is 1 × s1, b32 is 1 × s2, b33 is scalar.
Consider (s+1) × m-minor B″ composed of all but the last column of B′. The matrix D′ B″ is splittable and hence, either there exists a row d31 such that d31B11 + b31 = 0 or there exists a row d32 such that d32 B22 + b32 = 0. If the former is true, then B′ is obviously splittable. In particular, it is the case when B11 is a column. Assume the latter is true and B11 consists of at least two columns. Then
Consider an (s + 1) × m-minor B composed of all but the first column of B′. Since B′ is splittable, there exists a row d33 such that d33 B22 = 0 and d33 b23 + d32 b23 + b33 = 0. Thus,
which proves that B′ is splittable.
We proved that any s × m minor of C resting on c is splittable. In particular, itmeans that the matrix C is splittable contradicting to the assumption of the theorem. Hence for every column c of the matrix C there exists a non-splittable s × (s + 1)-minor s ≥ 2.
It can be proved that for every two rows c1 and c2 of the matrix C there exists a non-splittable s × (s + 1)-minor , s ≥ 2, of the matrix C. The proof of this fact is similar to the previous one and is omitted.
Now, we prove the theorem for the general case. Assume there are functions fi (xi), i = 1, …, n satisfying the equation Č f′(x) = 0 on X*.
Consider any function gi. There exists a non-splittable s × (s +1) minor resting on i-th column of C. Let be one of the minors with the largest s. Assume this minor rests on the rows with indexes . Now consider any elementary subproblem with the constraints matrix C′, which includes the columns with indexes . If the matrix C′ does not coincide with the minor then the elementary subproblem is splittable. Split it and consider the subproblem with the constraints matrix including the columns with indexes . Proceeding in this way we come to a non-splittable elementary subproblem with the constraint matrix .
It can be shown that the functions fi satisfy the equation , where ′ consists of the elements of f′ with indexes . For this problem the theorem has been already proven and therefore the Eq. (57) holds for all i .
The same procedure can be performed for all sets of indexes. The scalar r in (57) is the same for all i = 1, …, n since for every i1 and i2 there exists an elementary subproblem, which contains both gi1 and gi2. Obviously, the constants qi must satisfy Čq = 0.
Alexander V. Terekhov, Department of Kinesiology, The Pennsylvania State University, 039 Recreation Building, University Park, PA 16802, USA. Institut des Systèmes Intelligents et de Robotique, CNRS-UPMC, Pyramide ISIR, 4 Place Jussieu, 75005 Paris, France, Email: moc.liamg@vohkeretva.
Yakov B. Pesin, Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA.
Xun Niu, Department of Kinesiology, The Pennsylvania State University, 039 Recreation Building, University Park, PA 16802, USA.
Mark L. Latash, Department of Kinesiology, The Pennsylvania State University, 039 Recreation Building, University Park, PA 16802, USA.
Vladimir M. Zatsiorsky, Department of Kinesiology, The Pennsylvania State University, 039 Recreation Building, University Park, PA 16802, USA, Email: ude.usp@1zxv.