HHS Public Access author manuscript; accepted for publication in a peer-reviewed journal.
 
J Math Sociol. Author manuscript; available in PMC 2011 January 1.
Published in final edited form as:
J Math Sociol. 2010; 34(2): 136–145.
doi:  10.1080/00222500903221571
PMCID: PMC2874990
NIHMSID: NIHMS178451

A Note on Algebraic Solutions to Identification

Abstract

Algebraic methods for establishing the identification of structural equation models remain a viable option. However, it is sometimes unclear whether an algebraic solution establishes identification. One example is when there is more than one way to solve for a parameter, and one way leads to a single value while a second way leads to a function with more than one value. This note proves that one explicit and unique solution is sufficient for model identification even when other explicit solutions permit more than one value. The results are illustrated with an example and are useful to researchers who address model identification by algebraic means.

Keywords: identification, structural equation models, nonlinear simultaneous equations

1 Introduction

Model identification refers to whether it is possible to find unique values of all model parameters from the population moments of the observed variables. Typically, the population moments refer to the variances, covariances, and means of the observed variables, though higher-order moments are sometimes used (e.g., Bentler, 1983). Algebraic solutions are the oldest approach to identification, dating back at least to the work of Sewall Wright (1921). The approach rests on writing each variance, covariance, and mean of the observed variables as a function of the parameters of the model. Each model parameter is then solved for as a function of one or more of these moments of the observed variables. As Long (1983, page 44) notes:1 “In general, the most effective way to demonstrate that a model is identified is to show that through algebraic manipulations of the model’s covariance equations each of the parameters can be solved in terms of the population variances and covariances of the observed variables. This is a necessary and sufficient condition of identification.”

Though a variety of rules of identification have emerged from the econometric (e.g., Fisher, 1966) and the latent variable literatures (e.g., Bollen, 1989, 238-47, 326-32; Davis, 1993), these have not eliminated the need to turn to algebraic methods of identification. First, these rules do not cover all models. Second, common empirical checks of identification are based on Wald’s Rank Rule (Wald, 1950) or on checking the singularity of the information matrix (Rothenberg, 1971), and these check local, not global, identification. Furthermore, these local identification checks are based on sample estimates. Because rules do not exist for all situations and local identification checks are limited, algebraic solutions remain an important approach to establishing the identification of a model, or of parts of a model, where identification is uncertain.2

Ambiguities in the algebraic approach, however, arise when there are multiple ways of solving for a parameter using different moments of the observed variables, as is typically the case with overidentified models. In such situations, it is possible for one solution to yield a single set of parameter values while another solution permits two or more values for at least some of the parameters (e.g., this may arise with solutions involving square roots). In this note we prove that obtaining at least one solution that yields a unique value for each parameter is sufficient to establish the global identification of the model. This matters because a researcher establishing identification by algebraic means might not know whether a parameter or model is identified on finding two or more solutions for the same parameter, at least one of which permits the parameter to take two or more distinct values. We have encountered this problem in experiments with Computer Algebra Systems (CAS) applied to determining the identification of complex structural equation models (SEMs). Indeed, the proofs and this paper grew out of our attempts to determine what to do when faced with this situation and our failure to find any answer to this question in the literature on model identification. However, the result might also be useful in other situations where researchers use algebraic means to solve for parameters when there are more equations than there are parameters.

Our note proceeds as follows. First, we review the identification of SEMs in general terms. Second, we examine four cases involving different types of algebraic solutions for model parameters and provide our proof that obtaining one solution with unique parameter values establishes identification. We conclude with an illustrative model in which we use a CAS algorithm and employ our result to determine model identification. We focus only on the use of the variances, covariances, and means of observed variables and using them to identify model parameters, though our results on the conditions for unique solutions would generalize to the examination of higher-order moments.3

2 Algebraic Solutions

Suppose that we have

$$\sigma = F(\theta) \tag{1}$$

where σ is a vector of variances, covariances, or means of observed random variables, θ is a vector of model parameters, and F(θ) is a vector of functions of θ. F(θ) takes different forms depending on the specific SEM. In confirmatory factor analysis, for example, the vector of implied variances, covariances, and means is

$$F(\theta) = \begin{pmatrix} \operatorname{vech}[\Lambda\Phi\Lambda' + \Theta] \\ \alpha + \Lambda\mu_{\xi} \end{pmatrix}$$

where Λ is the matrix of factor loadings, Φ is the covariance matrix of the factors, Θ is the covariance matrix of the unique factors, vech is a matrix operator that stacks all of the nonredundant elements of ΛΦΛ′ + Θ into a vector, α is the vector of intercepts, and μξ is the vector of means of the factors ξ. F(θ) is the model-implied moment vector. In general, we assume that the variances, covariances, and means of all variables exist, that all variances in σ and θ are nonnegative, and that any implicit or explicit correlations between any two variables are less than one in absolute value. As mentioned above, we only make use of the means, variances, and covariances of the observed variables in identifying the model parameters.

To define global identification, consider two vectors θa and θb, each of which contains numeric values for the unknown parameters in θ. For each vector we can form the implied covariances and variances, say σa = F(θa) and σb = F(θb), for each set of numeric values. If the model is identified, all θa and θb solutions where F(θa) = F(θb) must have θa = θb. If a pair of vectors θa and θb exists such that F(θa) = F(θb) and θa ≠ θb, then θ is not globally identified. Local identification is a weaker concept of uniqueness. A parameter vector θ is locally identified at a point θa, if in the neighborhood of θa there is no vector θb for which F(θa) = F(θb) unless θa = θb (Bollen, 1989, page 248).
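As a minimal sketch of this definition, the Python code below searches a finite set of candidate parameter vectors for pairs that are distinct yet imply the same moments. The moment function `F_toy` and the candidate grid are hypothetical, chosen only for illustration and not taken from any model in this note. Finding such a pair shows that global identification fails; finding none on a finite grid proves nothing by itself.

```python
from itertools import combinations

def equivalent_pairs(F, candidates, tol=1e-9):
    """Return pairs (theta_a, theta_b) with theta_a != theta_b but F(theta_a) == F(theta_b)."""
    pairs = []
    for theta_a, theta_b in combinations(candidates, 2):
        if theta_a == theta_b:
            continue  # identical vectors never count against identification
        same = all(abs(a - b) <= tol for a, b in zip(F(theta_a), F(theta_b)))
        if same:
            pairs.append((theta_a, theta_b))
    return pairs

def F_toy(theta):
    # Hypothetical one-parameter model: the only moment is theta squared.
    (t,) = theta
    return (t * t,)

# Hypothetical candidate values for theta.
grid = [(-1.0,), (-0.5,), (0.5,), (1.0,)]
pairs = equivalent_pairs(F_toy, grid)              # sign-flipped pairs are equivalent
pairs_positive = equivalent_pairs(F_toy, [(0.5,), (1.0,)])  # none once the sign is fixed
```

Restricting the toy parameter to positive values removes the equivalent pairs, which is the kind of restriction that turns a two-valued square-root solution into a single-valued one.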

Suppose that we form subsets of the elements of σ such that each subset vector, σj, has a dimension equal to the number of parameters in θ and each element of θ appears at least once in the Fj(θ) that corresponds to σj where Fj(θ) refers to the subvector of F(θ) that corresponds to σj. This leads to

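$$\begin{aligned} \sigma_1 &= F_1(\theta) \\ \sigma_2 &= F_2(\theta) \\ &\;\;\vdots \\ \sigma_J &= F_J(\theta) \end{aligned} \tag{2}$$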

Given that equation (1) is true, each equation in (2) must be true since they are just subsets of the original true equation. Suppose that K of these equations have explicit solutions for θ that are functions of elements of σ. We write these solutions as

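$$\begin{aligned} \theta &= G_1(\sigma_1) \\ \theta &= G_2(\sigma_2) \\ &\;\;\vdots \\ \theta &= G_K(\sigma_K) \end{aligned} \tag{3}$$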

where $G_k(\sigma_k)$ is a function of $\sigma_k$ that is an explicit solution for θ and where the $G_k(\sigma_k)$, $k = 1, 2, \ldots, K$, represent different functions. Further assume that when there is no superscript $(l)$, the function $G_k(\sigma_k)$ is explicit and unique in that it leads to only one solution. If instead we have, say, $G_k^{(1)}(\sigma_k), G_k^{(2)}(\sigma_k), \ldots, G_k^{(L)}(\sigma_k)$, then there are L explicit solutions for the given function. For instance, if the explicit solution involves a square root, then we would have the positive and negative square root solutions with L = 2.

We distinguish four cases:

Case 1

Only one explicit solution exists, and it is unique. Without loss of generality, let this solution be given by θ = G1(σ1). In this case the model would be identified since G1(σ1) is the only solution and results in a single solution. This situation is generally encountered when the number of parameters equals the number of variances, covariances, and means of the observed variables. However, having the same number of parameters and number of moments does not guarantee a solution nor that it will be a unique solution.

Case 2

The only explicit solution is $\theta^{(l)} = G_1^{(l)}(\sigma_1)$, and this leads to, say, L possible values, $\theta^{(1)} = G_1^{(1)}(\sigma_1)$, $\theta^{(2)} = G_1^{(2)}(\sigma_1)$, …, $\theta^{(L)} = G_1^{(L)}(\sigma_1)$, where $G_1^{(t)}(\sigma_1) \neq G_1^{(u)}(\sigma_1)$, which implies that $\theta^{(t)} \neq \theta^{(u)}$ for all $t \neq u$. Given that we have L explicit solutions, can we tell whether θ is identified? Consider global identification first. These algebraic solutions derive from the original equation σ = F(θ), which corresponds to the model. This means that if any of these solutions, say $\theta^{(s)}$, is substituted for θ in σ = F(θ), then $F(\theta^{(s)})$ will equal σ. Since $\theta^{(t)} \neq \theta^{(u)}$, the model parameters cannot be globally identified. So in Case 2 the model is not globally identified. We can check local identification with Wald’s Rank Rule. Form the matrix of derivatives $\partial F(\theta)/\partial\theta$ and check whether its rank equals the number of independent parameters, where we assume the differentiability of F(θ) with respect to θ.4 If it does, then the model is locally identified. If its rank is less, then it is not.
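The following Python sketch illustrates Case 2 under stated assumptions. The two-moment model `F_toy`, its two square-root branches, and all function names are hypothetical and serve only as an example. The rank check uses a finite-difference Jacobian and plain Gaussian elimination, which is one simple way to apply the rank rule numerically, not a method prescribed in this note.

```python
import math

def F_toy(theta):
    # Hypothetical model: sigma_1 = t1^2, sigma_2 = t1 * t2.
    t1, t2 = theta
    return (t1 * t1, t1 * t2)

def explicit_solutions(sigma):
    # Both branches of the square-root solution (L = 2).
    s1, s2 = sigma
    root = math.sqrt(s1)
    return [(root, s2 / root), (-root, -s2 / root)]

def jacobian(F, theta, h=1e-6):
    """Central-difference approximation; rows are moments, columns are parameters."""
    n_moments = len(F(theta))
    J = [[0.0] * len(theta) for _ in range(n_moments)]
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += h
        down[j] -= h
        f_up, f_down = F(up), F(down)
        for i in range(n_moments):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J

def matrix_rank(M, tol=1e-8):
    """Rank by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n_rows, n_cols = len(A), len(A[0])
    rank = 0
    for col in range(n_cols):
        pivot = max(range(rank, n_rows), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, n_rows):
            factor = A[r][col] / A[rank][col]
            for c in range(col, n_cols):
                A[r][c] -= factor * A[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank

sigma = F_toy((2.0, 3.0))
branches = explicit_solutions(sigma)   # two distinct theta vectors, same sigma
rank_at_branch = [matrix_rank(jacobian(F_toy, b)) for b in branches]
rank_at_zero = matrix_rank(jacobian(F_toy, (0.0, 3.0)))
```

In this toy model both branches reproduce σ, so it is not globally identified, yet the Jacobian has full column rank at each branch, so it is locally identified there. At a point with a zero first parameter the rank drops.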

Case 3

Suppose $\theta = G_1(\sigma_1)$ is a unique, explicit solution and we also have $\theta = G_2^{(1)}(\sigma_2)$ and $\theta = G_2^{(2)}(\sigma_2)$, so that there are two explicit solutions associated with $G_2^{(l)}(\sigma_2)$. Is the one unique explicit solution sufficient to identify θ?

As we stated above, all equations in (2) are true since they are just subsets of the true equation in (1). The equations in (3) derive from the equations in (2) and hence σ1 = F1(θ) and σ2 = F2(θ) must both be true and the value(s) of θ must satisfy both equations.

There are several possibilities to consider:

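  1. $\theta = G_1(\sigma_1)$ is true, $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is true
  2. $\theta = G_1(\sigma_1)$ is false, $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is false
  3. $\theta = G_1(\sigma_1)$ is false, $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is true
  4. $\theta = G_1(\sigma_1)$ is true, $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is false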

Consider the first possibility, that $\theta = G_1(\sigma_1)$ and $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, are both true. This implies that

$$G_1(\sigma_1) = G_2^{(l)}(\sigma_2), \qquad l = 1, 2,$$

which cannot be true since $G_1(\sigma_1)$ is a single-valued solution and it cannot equal two different values, $G_2^{(1)}(\sigma_2)$ and $G_2^{(2)}(\sigma_2)$. Therefore, by contradiction, we dismiss the first possibility as invalid.

The second possibility, that $\theta = G_1(\sigma_1)$ and $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, are both false, we also rule out by contradiction. The solution $\theta = G_1(\sigma_1)$ is implied by $\sigma_1 = F_1(\theta)$. If $\theta = G_1(\sigma_1)$ is false, then $\sigma_1 = F_1(\theta)$ is false. But this contradicts our premise that σ = F(θ), and hence $\sigma_1 = F_1(\theta)$, is true. Therefore, possibility 2 cannot be true since $\theta = G_1(\sigma_1)$ must be true. By the same logic, we can rule out the third possibility since it too assumes that $\theta = G_1(\sigma_1)$ is false, and we just ruled that out.

By process of elimination, possibility four must be true (i.e., $\theta = G_1(\sigma_1)$ is true, $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is false). The statement that $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$, is false requires closer examination since it contains two possible values. It could be false in one of three ways:

  1. $\theta = G_2^{(1)}(\sigma_2)$ is false, $\theta = G_2^{(2)}(\sigma_2)$ is false
  2. $\theta = G_2^{(1)}(\sigma_2)$ is false, $\theta = G_2^{(2)}(\sigma_2)$ is true
  3. $\theta = G_2^{(1)}(\sigma_2)$ is true, $\theta = G_2^{(2)}(\sigma_2)$ is false

Using proof by contradiction, we can rule out the first of these: if both solutions are false, then $\sigma_2 = F_2(\theta)$ is false, but we know that the latter is true. Therefore we are left with possibility 2 or 3. Which of these two is true is determined by whether $G_1(\sigma_1) = G_2^{(1)}(\sigma_2)$ or $G_1(\sigma_1) = G_2^{(2)}(\sigma_2)$. As shown above, both of these equalities cannot hold. However, one of them must hold, and that determines which of the two solutions, $G_2^{(1)}(\sigma_2)$ or $G_2^{(2)}(\sigma_2)$, is true. This in turn shows that having one function that leads to a single unique value is sufficient to establish a single value for θ even if a second function leads to a solution with two values.
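The selection step in this argument can be sketched in a few lines of Python. The function names and the tolerance are hypothetical, and the numbers in the example are the β values that appear in the illustration of Section 3. The sketch treats a scalar parameter; for a vector θ the same comparison would be made element by element.

```python
def consistent_branches(unique_value, branch_values, tol=1e-8):
    """Indices of the branches of the multi-valued solution that agree with the unique one."""
    return [i for i, value in enumerate(branch_values)
            if abs(value - unique_value) <= tol]

def resolve(unique_value, branch_values, tol=1e-8):
    """Return (index, value) of the single branch that matches the unique solution.

    Zero matches would mean the moments are not consistent with the model;
    several matches would mean the branch values are not distinct.
    """
    matches = consistent_branches(unique_value, branch_values, tol)
    if len(matches) != 1:
        raise ValueError("expected exactly one branch to agree with the unique solution")
    return matches[0], branch_values[matches[0]]

# Unique solution 0.5; the two-valued solution offers 2.0 and 0.5.
index, value = resolve(0.5, [2.0, 0.5])
```

With population moments that satisfy the model and distinct branch values, exactly one branch survives, which is the content of the proof above.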

Case 4

The preceding proof considers only two solution functions (i.e., $\theta = G_1(\sigma_1)$ and $\theta = G_2^{(l)}(\sigma_2)$, $l = 1, 2$). What happens if there are additional functions that have two-valued solutions? It is easy to show that the choice of the second function is arbitrary and that the above proof holds for any two-valued solution chosen in conjunction with a single-valued solution.

What happens if there is a second function that takes more than two values? Apart from adding solution values to the second function, the above proof would remain essentially the same.

Therefore, having one function with a single solution for θ is sufficient to establish identification even if there are other, distinct functions that have multiple solutions.

Note that our discussion focuses on a sufficient, but not necessary condition for identification. It is possible to have a situation with several solution functions, each of which has multiple solutions, but to still have the parameter identified (e.g., only one solution is consistent across these solution functions).

3 Illustration

We now turn to an illustration of the utility of our result in assessing the identification of the SEM shown in Figure 1. Our illustrative model contains one exogenous and two endogenous observed variables. We specify a reciprocal relationship between the two endogenous variables, but constrain the coefficients of the two paths to be equal.

Figure 1
Model to Demonstrate Identification Result

This model can also be expressed by the following system of equations:

$$\begin{aligned} y_1 &= \beta y_2 + \zeta_1 \\ y_2 &= \beta y_1 + \gamma x_1 + \zeta_2. \end{aligned}$$

In this model we have six variances and covariances and five model parameters. We define σ11 = V (y1), σ22 = V (y2), and σ33 = V (x1) and the various covariances represented by the appropriate subscripts. This model leads to the following vector of functions, F(θ):

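$$\sigma = F(\theta): \qquad \begin{pmatrix} \sigma_{11} \\ \sigma_{21} \\ \sigma_{22} \\ \sigma_{31} \\ \sigma_{32} \\ \sigma_{33} \end{pmatrix} = \begin{pmatrix} \dfrac{\beta^{2}V(\zeta_2) + \beta^{2}\gamma^{2}V(x_1) + V(\zeta_1)}{(1-\beta^{2})^{2}} \\[2ex] \dfrac{\beta V(\zeta_1) + \beta V(\zeta_2) + \beta\gamma^{2}V(x_1)}{(1-\beta^{2})^{2}} \\[2ex] \dfrac{\beta^{2}V(\zeta_1) + \gamma^{2}V(x_1) + V(\zeta_2)}{(1-\beta^{2})^{2}} \\[2ex] \dfrac{\beta\gamma V(x_1)}{1-\beta^{2}} \\[2ex] \dfrac{\gamma V(x_1)}{1-\beta^{2}} \\[2ex] V(x_1) \end{pmatrix}.$$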
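The moment vector can be written as a small Python function. This is a sketch that assumes, in line with the count of five parameters, that ζ1 and ζ2 are uncorrelated with each other and with x1. The argument names `phi`, `psi11`, and `psi22` are labels chosen here for V(x1), V(ζ1), and V(ζ2).

```python
def implied_moments(beta, gamma, phi, psi11, psi22):
    """Return (s11, s21, s22, s31, s32, s33) implied by the model parameters.

    phi = V(x1), psi11 = V(zeta1), psi22 = V(zeta2); the disturbances are
    assumed uncorrelated with each other and with x1.
    """
    d = 1.0 - beta ** 2  # determinant of (I - B); the reduced form needs d != 0
    if d == 0.0:
        raise ValueError("beta**2 == 1: the reduced form does not exist")
    s11 = (psi11 + beta ** 2 * psi22 + beta ** 2 * gamma ** 2 * phi) / d ** 2
    s21 = beta * (psi11 + psi22 + gamma ** 2 * phi) / d ** 2
    s22 = (beta ** 2 * psi11 + psi22 + gamma ** 2 * phi) / d ** 2
    s31 = beta * gamma * phi / d
    s32 = gamma * phi / d
    s33 = phi
    return (s11, s21, s22, s31, s32, s33)

# Parameter values used for the numerical check later in this section.
sigma = implied_moments(beta=0.5, gamma=2.0, phi=2.0, psi11=2.0, psi22=3.0)
rounded = tuple(round(s, 2) for s in sigma)
```

The expressions follow from the reduced form y1 = (ζ1 + βζ2 + βγx1)/(1 − β²) and y2 = (βζ1 + ζ2 + γx1)/(1 − β²).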

For this system of equations, if we choose a subset of equations that includes the equation relating the covariance between the two endogenous variables to the model parameters (σ21), then we obtain a solution for some of the model parameters involving a square root. For example, if we choose the subset (σ11, σ22, σ33, σ21, σ31) we obtain the following two solutions for β:5

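$$\beta = \frac{\sigma_{11} + \sigma_{22} + \left(\sigma_{11}^{2} + 2\sigma_{11}\sigma_{22} + \sigma_{22}^{2} - 4\sigma_{21}^{2}\right)^{1/2}}{2\sigma_{21}}, \qquad \beta = \frac{\sigma_{11} + \sigma_{22} - \left(\sigma_{11}^{2} + 2\sigma_{11}\sigma_{22} + \sigma_{22}^{2} - 4\sigma_{21}^{2}\right)^{1/2}}{2\sigma_{21}}.$$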
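A brief sketch of where the square root comes from, assuming the disturbances ζ1 and ζ2 are uncorrelated with each other and with x1, and that β and σ21 are nonzero. The symbol c is shorthand introduced here and is not part of the original notation.

```latex
\begin{align*}
\sigma_{11} + \sigma_{22} &= \frac{(1+\beta^{2})\,c}{(1-\beta^{2})^{2}}, \qquad
\sigma_{21} = \frac{\beta\,c}{(1-\beta^{2})^{2}}, \qquad
c = V(\zeta_1) + V(\zeta_2) + \gamma^{2}V(x_1), \\[1ex]
\frac{\sigma_{11}+\sigma_{22}}{\sigma_{21}} &= \frac{1+\beta^{2}}{\beta}
\;\Longrightarrow\;
\sigma_{21}\beta^{2} - (\sigma_{11}+\sigma_{22})\beta + \sigma_{21} = 0, \\[1ex]
\beta &= \frac{(\sigma_{11}+\sigma_{22}) \pm \left[(\sigma_{11}+\sigma_{22})^{2} - 4\sigma_{21}^{2}\right]^{1/2}}{2\sigma_{21}}.
\end{align*}
```

Expanding $(\sigma_{11}+\sigma_{22})^{2}$ gives the form shown above. Because the constant and leading coefficients of the quadratic are equal, the product of its two roots is one, so the two candidate values for β are reciprocals of each other.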

If instead we choose a subset of equations that does not include the equation involving the covariance between the two endogenous variables (e.g., (σ11, σ22, σ33, σ31, σ32)), then we find a unique solution for each parameter. Using this subset of equations, we obtain

$$\beta = \frac{\sigma_{31}}{\sigma_{32}}.$$

As established above, this is sufficient to determine that the model is globally identified.

As an additional check, our result implies that in any given numerical setting one of the solutions for β that we obtained from the first subset should equal the solution obtained from the second subset. We demonstrate that this is the case by generating a covariance matrix based on arbitrary numerical values for each of the model parameters, and then checking which of the first two solutions for β is consistent with the second solution. If we let β = 0.5, γ = 2, ϕ = V(x1) = 2, ψ11 = V(ζ1) = 2, and ψ22 = V(ζ2) = 3, then we obtain the following covariance matrix (rounded to two decimal places):

$$\begin{bmatrix} 8.44 & & \\ 11.56 & 20.44 & \\ 2.67 & 5.33 & 2.00 \end{bmatrix}.$$

Substituting the covariances into the two solutions for β we find:

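$$\beta = \frac{8.44 + 20.44 + \left(8.44^{2} + 2(8.44)(20.44) + 20.44^{2} - 4(11.56^{2})\right)^{1/2}}{2(11.56)} = 2.00,$$

$$\beta = \frac{8.44 + 20.44 - \left(8.44^{2} + 2(8.44)(20.44) + 20.44^{2} - 4(11.56^{2})\right)^{1/2}}{2(11.56)} = 0.50.$$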

In this case, the second solution for β matches the unique solution from the other subset of equations, β = 2.67/5.33 = 0.50 (and both, of course, match the value we chose in generating the covariance matrix). Furthermore, for this model to be globally identified, substituting the solution β = 2 (along with the corresponding solutions for the other elements of θ) into the full set of equations must yield an implied covariance matrix different from the one given above. This substitution generates the following implied covariance matrix:

$$\begin{bmatrix} 8.40 & & \\ 11.50 & 20.36 & \\ 2.67 & 1.33 & 2.00 \end{bmatrix},$$

with the clearest difference being in the σ32 element.
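The whole check can be reproduced with a short Python sketch. It assumes uncorrelated disturbances, and the function names, along with `phi`, `psi11`, and `psi22` for V(x1), V(ζ1), and V(ζ2), are labels chosen here. The recovery of the remaining parameters from (σ11, σ22, σ33, σ31) is one straightforward way to solve the first subset and is not taken from the CAS output. The sketch works from unrounded moments, so its results need not match the printed matrices to the second decimal place.

```python
import math

def implied_moments(beta, gamma, phi, psi11, psi22):
    """Return (s11, s21, s22, s31, s32, s33); disturbances assumed uncorrelated."""
    d = 1.0 - beta ** 2
    s11 = (psi11 + beta ** 2 * psi22 + beta ** 2 * gamma ** 2 * phi) / d ** 2
    s21 = beta * (psi11 + psi22 + gamma ** 2 * phi) / d ** 2
    s22 = (beta ** 2 * psi11 + psi22 + gamma ** 2 * phi) / d ** 2
    return (s11, s21, s22, beta * gamma * phi / d, gamma * phi / d, phi)

def beta_roots(s11, s21, s22):
    """Both roots of s21*b^2 - (s11 + s22)*b + s21 = 0, larger root first."""
    total = s11 + s22
    disc = math.sqrt(total ** 2 - 4.0 * s21 ** 2)
    return ((total + disc) / (2.0 * s21), (total - disc) / (2.0 * s21))

def other_parameters(beta, s11, s22, s31, s33):
    """Solve gamma, phi, psi11, psi22 for a given beta (needs beta != 0, beta^2 != 1)."""
    d = 1.0 - beta ** 2
    phi = s33
    gamma = s31 * d / (beta * phi)
    a = s11 * d ** 2 - beta ** 2 * gamma ** 2 * phi   # equals psi11 + beta^2 * psi22
    b = s22 * d ** 2 - gamma ** 2 * phi               # equals beta^2 * psi11 + psi22
    psi11 = (a - beta ** 2 * b) / (1.0 - beta ** 4)
    psi22 = (b - beta ** 2 * a) / (1.0 - beta ** 4)
    return gamma, phi, psi11, psi22

# Generating values from the text: beta, gamma, phi, psi11, psi22.
sigma = implied_moments(0.5, 2.0, 2.0, 2.0, 3.0)
s11, s21, s22, s31, s32, s33 = sigma

root_plus, root_minus = beta_roots(s11, s21, s22)   # two-valued solution
beta_unique = s31 / s32                             # single-valued solution

# Rebuild all six moments from the root that disagrees with the unique solution.
alt_params = other_parameters(root_plus, s11, s22, s31, s33)
sigma_alt = implied_moments(root_plus, *alt_params)
```

The root that disagrees with σ31/σ32 reproduces the five moments used to solve for it but implies a different σ32, so it is not an observationally equivalent parameter vector.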

4 Conclusion

Algebraic solution of the moment equations was an early means of establishing model identification, and it remains important both in establishing new rules of identification and in covering situations that do not fall under existing rules. However, an ambiguous situation emerges when there are two or more explicit, distinct solutions for a parameter and one or more of these solutions permits multiple values, such as when the solution involves a square root. This note establishes that if one explicit and unique solution is found for the model parameters, then this is sufficient to establish model identification even when there are other explicit solutions that permit more than one value. This result is of particular significance when a CAS is employed to establish algebraically the identification of models that do not conform to the known rules for identification.

Footnotes

1 Long (1983) only mentions the variances and covariances of the observed variables. In some models, the means also can play a role.

2 Algebraic solutions can also be useful in formulating new rules of identification (e.g., O’Brien, 1994).

3 Higher-order moments in some situations provide additional information that would aid model identification. However, these higher-order moments are rarely used and we confine ourselves to the typical situation where a researcher only employs the variances and covariances, and sometimes the means, of the observed variables to aid model identification.

4 A reviewer points out that if θ is discrete, these derivatives would not exist, but that there are cases in which a local identification of θ is well-defined (e.g., when θ is unidimensional and its states admit a total order).

5 We do not report the entire vectors $G_1^{(1)}(\sigma)$ or $G_1^{(2)}(\sigma)$ due to considerations of space.

5 References

  • Bentler PM. Simultaneous equation systems as moment structure models. Journal of Econometrics. 1983;22:13–42.
  • Bollen KA. Structural equations with latent variables. Wiley; New York: 1989.
  • Fisher FM. The identification problem in econometrics. McGraw Hill; New York: 1966.
  • Long JS. Confirmatory factor analysis: A preface to LISREL. Sage University Press; Beverly Hills: 1983.
  • O’Brien RM. Identification of simple measurement models with multiple latent variables and correlated errors. Sociological Methodology. 1994;24:137–170.
  • Rothenberg TJ. Identification in parametric models. Econometrica. 1971;39:577–591.
  • Wald A. A note on the identification of econometric relations. In: Koopmans TC, editor. Statistical Inference in Dynamic Economic Models. Wiley; New York: 1950. pp. 238–244.
  • Wright S. Correlation and causation. Journal of Agricultural Research. 1921;20:557–585.