Enter Your Search:Search tips Search criteria Articles Journal titles Advanced

Discrete Appl Math. Author manuscript; available in PMC 2010 August 6.
Published in final edited form as:
Discrete Appl Math. 2009 August 6; 157(15): 3314–3319.
PMCID: PMC2749697
NIHMSID: NIHMS138279

# The inverse problem for certain tree parameters

## Abstract

Let p be a graph parameter that assigns a positive integer value to every graph. The inverse problem for p asks for a graph within a prescribed class (here, we will only be concerned with trees), given the value of p. In this context, it is of interest to know whether such a graph can be found for all or at least almost all integer values of p. We will provide a very general setting for this type of problem over the set of all trees, describe some simple examples and finally consider the interesting parameter “number of subtrees”, where the problem can be reduced to some number-theoretic considerations. Specifically, we will prove that every positive integer, with only 34 exceptions, is the number of subtrees of some tree.

Keywords: inverse problem, graph parameter, trees

## 1 Introduction and a general construction principle

Numerous graph parameters have been introduced and thoroughly studied. Many of them are defined as the number of vertex or edge subsets of some sort. For instance, Klazar [5] describes twelve different tree parameters and analyses them by means of a generating functions approach. Recently, the inverse problem has gained a certain importance: given a graph parameter p that assigns an integer value to every graph within a given family of graphs and an integer n, determine some G such that p(G) = n. This is of interest in combinatorial chemistry, where graphical parameters are used as molecular descriptors. For applications in the design of combinatorial libraries and molecular recognitions, we refer to [7] and the references therein. In this context, acyclic graphs play a major role, and we will also concentrate on trees in this paper.

We will generally consider rooted trees, which will turn out to be useful for the treatment of the various parameters. (Some of the tree parameters are actually defined for rooted trees only). All rooted trees can be constructed recursively by means of the following two operations:

1. Attach a new vertex to the root, which also serves as the root of the new tree (see Fig. 1)
Attaching a new vertex to the root
2. Merge two rooted trees by identifying their roots (see Fig. 2)
Merging two rooted trees

Many tree parameters can be described recursively in a natural way by means of these two operations. Let us present a few examples.

1. The number of (nonempty) subtrees has been studied in various papers, see for instance [11]; in order to obtain a recursion, we distinguish between subtrees containing the root and those not containing the root. The respective numbers form a pair (a, b) associated with a tree. If we perform Operation (I) now, the number of subtrees of the former type is 1 + the number of subtrees containing the root in the old tree (1 for the subtree that consists only of the root). On the other hand, a subtree that does not contain the root can be any subtree of the old tree. This means that Operation (I) maps (a, b) to (a + 1, a + b). Similar reasoning shows that if Operation (II) is applied to two trees with associated pairs (a1, b1) and (a2, b2), one gets the new pair (a1a2, b1 + b2).
2. The number of matchings (independent edge subsets, including the empty set) is a popular parameter in combinatorial chemistry (where it is known as “Hosoya index”, see for instance [12]) and statistical physics (so-called dimer-monomer model [4]), and of course it also has interesting mathematical properties. Again, a recursive characterization is possible. For this purpose, we distinguish between matchings which cover the root and those which do not, and we associate a pair (a, b) to each tree again. Then, Operation (I) corresponds to the map
whereas Operation (II) maps the pair (a1, b1), (a2, b2) to (a1b2+b1a2, b1b2).
3. The number of independent (vertex) subsets (including the empty set again) has similar properties to the number of matchings and is also a popular chemical parameter (the “Merrifield-Simmons index” [9,12]). Yet again it is useful to distinguish two cases: we associate a pair (a, b) to each tree, where a and b are the number of independent subsets containing resp. not containing the root. Then, the effects of the two operations are
(note that this is the same as for the number of matchings!) and
4. The number of maximal independent subsets, see for instance [16], needs three different auxiliary parameters: A triple (a, b, c) is associated to each tree, where a is the number of maximal independent sets containing the root, b is the number of maximal independent sets not containing the root, and c is the number of independent sets which are not maximal but become maximal if the root (that is necessarily not contained in the set) is removed from the tree. Similar reasoning as in the previous examples yields the two maps
and
corresponding to the two operations.
5. The number of dominating subsets satisfies a very similar recursion: again, three auxiliary parameters are necessary, representing the number of dominating sets which contain the root, those which do not contain the root, and those which dominate all vertices except for the root. Then, our operations induce the two maps
and
on triples associated to trees.
6. The Wiener index is one of the most popular graph-theoretical parameters, see for instance the survey paper [1]. It is defined as the sum of all the distances between unordered pairs of vertices. As shown in [2], it also satisfies certain recurrences: this time, we associate a triple (a, b, c) consisting of the number of vertices, the sum of all distances from the root, and the Wiener index itself, to a tree. It is not difficult to show that this yields the two maps
and
corresponding to Operation (I) and Operation (II) respectively. In order to obtain the last part of this formula, for instance, note that c1 takes account of all pairs of vertices belonging to the first of the two trees that are merged, c2 takes account of all pairs of vertices that belong to the second tree, and that b1(a2 − 1) + b2(a1 − 1) takes account of the remaining (a1 − 1)(a2 − 1) pairs: for each of these pairs, the unique path is the union of the respective paths to the root. The path from a vertex in the first tree (other than the root) to the root is counted exactly a2 − 1 times (once for every non-root vertex in the second tree), which gives rise to the summand b1(a2 − 1). The second summand is obtained in an analogous way.
7. The number of chains [5] is a parameter that only makes sense for rooted trees: a rooted tree can be regarded as a partially ordered set, where uv holds for two vertices u and v iff u lies on the unique path between v and the root. A chain is a nonempty set of pairwise comparable vertices. The number of chains in a rooted tree satisfies very simple recursions with respect to our two operations—no auxiliary parameters are necessary. Operation (I) induces the map
(the new root can be added to any chain and also to the empty set), Operation (II) yields the map
(a chain in the new tree is a chain in one of the old trees, and the chain that only includes the root is counted twice).
8. Finally, we consider the analogous parameter “number of antichains” [5]: an antichain is a nonempty set of mutually incomparable vertices. Again, no auxiliary parameters are necessary, and we obtain the two maps
(under Operation (I)) and
(under Operation (II)).

Note that all these recursive characterizations share a common scheme: in each case, a k-tuple (c1, c2, …, ck) is associated with each tree, the parameter we are actually interested in is some polynomial of the ci (typically linear), and our two operations induce polynomial maps f: kRk and g: k × kk.

## 2 Examples for the inverse problem over the set of trees

### 2.1 Number of matchings

For this parameter, the problem is essentially trivial, as has also been noticed in [7]: every integer is the number of matchings of some star, since a star with n vertices has exactly n matchings (the empty set and n − 1 single edges). Nevertheless, the problem can become much harder if one imposes further restrictions. For instance, it is not quite clear whether almost every positive integer is the number of matchings of some binary tree (at least, we are not aware of any result in this direction).

### 2.2 Number of antichains

This example is also quite trivial, since it is easily observed that the number of antichains of a path, rooted at one of its ends, is just the number of its vertices (the only antichains are single vertices in this case).

### 2.3 Number of chains

Another simple example, but it shows a new phenomenon: it is easy to check from the recursive definition that the number of chains in any rooted tree is always odd. On the other hand, the inverse problem can be solved for all odd integers: indeed, one finds that a star with n vertices, rooted at its center, has exactly 2n − 1 chains (note that all chains are either single vertices or pairs consisting of the root and any leaf), proving the claim.

### 2.4 Wiener index

This problem is somewhat harder than the previous ones. In their paper [6], Lepović and Gutman conjecture that every positive integer, with only 49 explicit exceptions, is the Wiener index of some tree. This conjecture was verified independently in [13] and [15] and further extended to the class of trees with maximum degree 3 in [14]. It should be noted that the Wiener index W (T) of a tree T with n vertices can be bounded above and below as follows:

$(n−1)2≤W(T)≤(n+13).$

This implies that the minimum does not increase linearly any more, which explains why no simple construction as in the previous examples can be given. However, since the growth is only polynomial, the problem can be reduced to number-theoretic considerations similar to those presented in the following section.

### 2.5 Number of independent subsets

Here, the situation is much harder. It is conjectured, yet unproven that almost every positive integer is the number of independent subsets of some tree. This is a question that is also of relevance in combinatorial chemistry: [7] provides an algorithm for constructing such a tree provided that one exists. The biggest difference to the previously considered parameters lies in the fact that the minimum over all trees of size n grows exponentially with n, making it less amenable to the Diophantine methods applied in [14] and also in this paper. However, Linek [8] gives a partial result in this direction: he shows that every positive integer is the number of independent sets of some bipartite graph.

### 2.6 Number of subtrees

This problem will form the main part of the paper. “On average”, the behavior of this parameter is similar to the number of independent subsets, but the minimum only grows polynomially, which allows us the construction of a reasonably rich set of trees for which the inverse problem can be reduced to a question of elementary number theory. Specifically, we will prove the following theorem in the next section:

#### Theorem 1

All positive integers, except for the 34 numbers 2, 4, 5, 7, 8, 9, 12, 13, 14, 16, 18, 19, 22, 23, 26, 27, 29, 31, 33, 35, 38, 39, 42, 43, 46, 50, 52, 54, 60, 65, 68, 72, 77 and 89, can be expressed as the number of subtrees of some tree.

## 3 Proof of Theorem 1

We will show that it is sufficient to consider caterpillar trees, see Fig. 3. These can be constructed recursively by means of our two operations, where we only use a particular special case of Operation (II). In each step, we attach exactly one new vertex to the root; there are two possibilities for doing so, namely

The caterpillar tree C(1, 0, 3, 1, 0, 0, 2). The starting root was the leftmost vertex, while the last root is the rightmost vertex.
• The attached vertex becomes the new root—this is exactly our Operation (I).
• The old root is also the root of the new tree—this is a special case of Operation (II) (merge with a two-vertex tree).

These operations have the following effects on pairs (a, b) as described in the previous section:

To construct caterpillars, we proceed as follows: start with a single vertex. Now apply Operation (I) a0 times, then our special case of Operation (II), then a1 times Operation (I) again, etc. The final step involves ak applications of Operation (I). The variables ai can be interpreted as the number of horizontal edges between subsequent “legs” of the caterpillar. Let the resulting tree be denoted by C(a0, …, ak); it clearly has a0 + a1 + … + ak + k + 1 vertices.

Straightforward induction yields the following lemma:

### Lemma 2

The number of subtrees of C(a0, …, ak) containing [resp. not containing] the last root (the rightmost vertex in Figure 3) is

$2k(1+∑j=0k2−jaj)[resp.k+∑j=0k(2jaj+aj(aj−1)2)+∑j=0k∑i=0j−12j−iaiaj].$

Thus the total number of subtrees of C(a0, …, ak) is given by

$2k+k+∑j=0k(2k−jaj+2jaj+aj(aj−1)2)+∑j=0k∑i=0j−12j−iaiaj.$
(1)

#### Proof

This is obviously true for a single vertex (k = 0, a0 = 0), and it is easy to see that it also holds for the path (a0 applications of Operation (I), k = 0 and a0 is arbitrary). Now the number of subtrees of C(a0, …, ak) can be calculated from C(a0, …, ak − 1) inductively:

After the application of our second operation, the pair becomes

$2k(1+∑j=0k−12−jaj)andk+∑j=0k−1(2jaj+aj(aj−1)2)+∑j=0k−1∑i=0j−12j−iaiaj,$

and ak-fold application of Operation (I) yields

$ak+2k(1+∑i=0k−12−iai)$

and

which completes the proof after some simplifications.

The formula for the number of subtrees C(a0, …, ak) represents a quadratic form, but it is not very practical as it stands. However, a carefully chosen substitution simplifies matters decisively: we set aj = xj − 2xj − 1, where $x−1=12$. Then, (1) reduces to

$2k+k+∑j=0k(2k−j(xj−2xj−1)+2j(xj−2xj−1)+(xj−2xj−1)(xj−2xj−1−1)2)+∑j=0k∑i=0j−12j−i(xi−2xi−1)(xj−2xj−1)=2k+k+∑j=0k(2k−j+2j)xj−∑j=−1k−1(2k−j+2j+2)xj+∑j=0kxj22−∑j=0k2xjxj−1+∑j=−1k−12xj2−∑j=0kxj2+∑j=−1k−1xj+∑j=0k∑i=0j−12j−ixixj−∑j=−1k−1∑i=0j2j−i+2xixj−∑j=0k∑i=−1j−22j−ixixj+∑j=−1k−1∑i=−1j−12j−i+2xixj=2k+k−3∑j=0k−12jxj+(2k+1)(xk−1)+∑j=0k−15xj2+xj2+xk2−xk2+1−∑j=0k2jxj+∑j=−1k−12j+2xj−4∑j=−1k−1xj2=k+xk(xk+1)2−∑j=0k−13xj2−xj2=xk(xk+1)2−∑j=0k−13xj2−xj−22.$

Note that $3x2−x2$ is a so-called pentagonal number. It is part of a more general theorem (see [10]) that every positive integer is a sum of five pentagonal numbers. However, it is not possible to apply this theorem to our problem, since we have aj ≥ 0 for all j, which yields the restrictions x0 ≥ 1 and xj ≥ 2xj − 1 for 1 ≤ jk. However, we can show the following, which will be sufficient for our purposes:

### Proposition 3

Every positive integer n ≥ 11034 can be written as

$n=∑j=0k−13xj2−xj−22,$

where x0 ≥ 1 and xj ≥ 2xj − 1 for 1 ≤ j ≤ k − 1.

#### Proof

We show that “for every integer ≥ 160, all integers within the closed interval [11034, ] can be written in the above form for some k, where in addition xk − 1”. This is done by induction on , the base case being 160 ≤ ≤ 319, which can be checked directly by means of a computer using a brute-force approach that just checks all the possibilities; a list of all numbers that are representable in this way is generated for all ≤ 319 in a recursive manner; then it only remains to check that the entire interval is contained. A Mathematica file containing all computations is available online at http://math.sun.ac.za/~swagner/inverseproblem. For the induction step, note that if n can be represented in the above form with xk − 1/2, then can be written in this form as well, where now k − 1 is replaced by k and xk = . Therefore, under the induction hypothesis, all numbers in the union

(for the inclusion, simply note that for ≥ 160) can be represented in the desired form, proving the claim.

An immediate consequence of the preceding proposition and its proof is that every integer within the interval

is the number of subtrees of some caterpillar tree, provided that ≥ 160 (simple take xk = 2 in the above formula). Taking the union over all ≥ 160 (where we note that subsequent intervals overlap, since

holds for all ≥ 1), we find that all integers ≥ 1527 can be represented in this way. An additional computer search shows that in fact all positive integers, except for 2, 4, 5, 7, 8, 9, 12, 13, 14, 16, 18, 19, 22, 23, 26, 27, 29, 31, 33, 35, 38, 39, 42, 43, 46, 48, 50, 52, 54, 60, 61, 64, 65, 68, 72, 77, 79, 85, 89, 93, 96, 123, 157 and 183, are the number of subtrees of some caterpillar (one merely has to check the aforementioned lists of representable numbers). Finally, it is not difficult to compute the number of subtrees of all trees with ≤ 18 vertices (by means of our recursions or directly in a brute-force manner) to obtain the final list. All computations are again available online. This is sufficient, since a tree with n vertices has at least $(n+12)$ subtrees (and thus at least 190 if the number of vertices is ≥ 19): consider any pair of two (possibly equal) vertices; then the unique path between them forms a subtree, proving that the total number of subtrees is indeed at least $(n+12)$.

## 4 Conclusion

On the meager evidence of two non-trivial tree parameters, the Wiener index and the number of subtrees of a tree, we tend to think that for most tree indices almost all natural numbers are values of that tree parameter, except when we see obvious reasons for the opposite (as in the case of chains, where only specific residue classes are covered). A general (but admittedly only heuristic) argument one is tempted to use is the following: if the number of trees with n vertices grows faster than the corresponding maximum value of a parameter (the number of non-isomorphic trees grows exponentially with an exponential base of 2.955765 —see [3]), then one would expect that almost all integer values (or at least those in certain residue classes) should be covered. A counterexample for this metaconjecture would be very instructive.

## Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

## Contributor Information

Éva Czabarka, Department of Mathematics, University of South Carolina, Columbia, SC 29208, United States of America.

László Székely, Department of Mathematics, University of South Carolina, Columbia, SC 29208, United States of America.

Stephan Wagner, Department of Mathematical Sciences, Mathematics Division, Stellenbosch University, Private Bag X1, Matieland 7602, South Africa.

## References

1. Dobrynin AA, Entringer R, Gutman I. Wiener index of trees: theory and applications. Acta Appl Math. 2001;66(3):211–249.
2. Entringer RC, Meir A, Moon JW, Székely LA. The Wiener index of trees from certain families. Australas J Combin. 1994;10:211–224.
3. Harary F, Palmer EM. Graphical enumeration. Academic Press; New York: 1973.
4. Heilmann OJ, Lieb EH. Theory of monomer-dimer systems. Commun Math Phys. 1972;25(3):190–232.
5. Klazar M. Twelve countings with rooted plane trees. European J Combin. 1997;18(2):195–210.
6. Lepović M, Gutman I. A Collective Property of Trees and Chemical Trees. J Chem Inf Comput Sci. 1998;38:823–826.
7. Li X, Li Z, Wang L. The inverse problems for some topological indices in combinatorial chemistry. J Computational Biology. 2003;10:47–55. [PubMed]
8. Linek V. Bipartite graphs can have any number of independent sets. Discrete Math. 1989;76(2):131–136.
9. Merrifield RE, Simmons HE. Topological Methods in Chemistry. Wiley; New York: 1989.
10. Nathanson MB. Additive number theory, volume 164 of Graduate Texts in Mathematics. Springer-Verlag; New York: 1996.
11. Székely LA, Wang H. On subtrees of trees. Adv in Appl Math. 2005;34(1):138–155.
12. Trinajstić N. Chemical graph theory. CRC Press; Boca Raton, FL: 1992.
13. Wagner S. A class of trees and its Wiener index. Acta Appl Math. 2006;91(2):119–132.
14. Wagner S, Wang H, Yu G. Molecular graphs and the inverse Wiener index problem. Discrete Applied Mathematics. to appear.
15. Wang H, Yu G. All but 49 numbers are Wiener indices of trees. Acta Appl Math. 2006;92(1):15–20.
16. Wilf HS. The number of maximal independent sets in a tree. SIAM J Algebraic Discrete Methods. 1986;7(1):125–130.

 PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers.