Inst Math Stat Collect. Author manuscript; available in PMC 2010 August 20.
Published in final edited form as:
Inst Math Stat Collect. 2009 January 1; 5: 103–107.
PMCID: PMC2924594
NIHMSID: NIHMS177738

# A note on bounds for VC dimensions

## Abstract

We provide bounds for the VC dimension of class of sets formed by unions, intersections, and products of VC classes of sets 1,…,m.

AMS 2000 subject classifications: Primary 60B99, secondary 62G30
Keywords and phrases: Vapnik-Chervonenkis class, combining classes, inequality, entropy

## 1. Introduction and main results

Let be a class of subsets of a set . An arbitrary set of n points {x1,…,xn} has 2n subsets. We say that picks out a certain subset from {x1,…,xn} if this can be formed as a set of the form C ∩ {x1,…,xn} for some C . The collection is said to shatter {x1,…,xn} if each of its 2n subsets can be picked out by . The VC - dimension V() is the largest cardinality of a set shattered by (or +∞ if arbitrarily large finite sets are shattered); more formally, if

then

and V() = −1 if is empty. (The VC-dimension V() defined here corresponds to S() as defined by [5] page 134. Dudley, and following him ourselves in [11], used the notation V() for the VC-index, which is the dimension plus 1. We have switched to using V() for the VC-dimension rather than the VC-index, because formulas are simpler in terms of dimension and because the machine learning literature uses dimension rather than index.)

Now suppose that 1, 2,…,m are VC-classes of subsets of a given set with VC dimensions V1,…,Vm. It is known that the classes , defined by

are again VC: when 1 = = m = and m = k, this is due to [2] (see also [3], Theorem 9.2.3, page 85, and [5], Theorem 4.2.4, page 141); for general 1, 2 and m = 2 it was shown by [3], Theorem 9.2.6, page 87, (see also [5], Theorem 4.5.3, page 153), and [9], Lemma 15, page 18. See also [8], Lemma 2.5, page 1032. For a summary of these types of VC preservation results, see e.g. [11], page 147. Similarly, if 1,…,m are VC-classes of subsets of sets 1,…,m, then the class of product sets defined by

is a VC-class of subsets of 1 × × m. This was proved in [1], Proposition 2.5, and in [3], Theorem 9.2.6, page 87 (see also [5], Theorem 4.2.4, page 141).

In the case of m = 2, consider the maximal VC dimensions max V(1 2), max V(1 2), and max V(1 2), where the maxima are over all classes 1, 2 (or 1, 2 in the last case) with V(1) = V1, V(2) = V2 for fixed V1, V2. As shown in [3], Theorem 9.2.7, these are all equal:

[3] provided the following bound for this common value:

Proposition 1.1. S(V1, V2) ≤ T(V1, V2) where, with ,

(1.1)

Because of the somewhat inexplicit nature of the bound in (1.1), this proposition seems not to have been greatly used so far.

Furthermore, [4] (Theorem 4.27, page 63; Proposition 4.38, page 64) showed that S(1, k) ≤ 2k + 1 for all k ≥ 1 with equality for k = 1, 2, 3.

Here we give a further more explicit bound for T(V1, V2) and extend the bounds to the case of general m ≥ 2. Our main result is the following proposition.

Theorem 1.1. Let . Then the following bounds hold:

(1.2)

where (V1,…,Vm),

is the “entropy” of the Vj’s under the discrete uniform distribution with weights 1/m and $V¯=m−1∑j=1mVj$.

Corollary 1.1. For m = 2 the following bounds hold:

where c1, c2, Ent(), and are as in Theorem 1.

Proof. The subsets picked out by ii from a given set of points {x1,…,xn} in are the sets C1Cm ∩ {x1,…,xn}. They can be formed by first forming all different sets of the form C1 ∩ {x1,…,xn} for C1 1, next intersecting each of these sets by sets in 2 giving all sets of the form C1C2 ∩ {x1,…,xn}, etc. If Δn(, y1,…,yn) #{C ∩{y1,…,yn} : C } and Δn() = maxy1,…,yn Δn(, y1,…,yn) for every collection of sets and points y1,…,yn (as in [11], page 135), then in the first step we obtain at most Δn(1) different sets, each with n or fewer points. In the second step each of these sets gives rise to at most Δn(2) different sets, etc. We conclude that

by [11], Corollary 2.6.3, page 136, and the bound (en/s)s for the number of subsets of size smaller than s for ns. By definition the left side of the display is 2n for n equal to the VC-dimension of ii. We conclude that

or

$nlog2≤∑i=1mVilog(e/Vi)+(∑i=1mVi)logn.$

With V i Vi, define r = en/V. Then the last display implies that

$rVlog2e≤∑iVilog(e/Vi)+Vlog(rV/e),$

or

and this inequality can in turn be rewritten as

Now note that g(x) x/ log xy for xe implies that x ≤ (e/(e − 1))y log y: g is minimized by x = e and is increasing; furthermore yg(x) for xe implies that

$logy≥logx−log logx=logx(1−log logxlogx)≥logx(1−1e)$

so that

$x≤ylogx≤y(1−1e)−1logy=ee−1ylogy.$

Thus we conclude that

$mreEnt(V¯)/V¯≤ee−1meeEnt(V¯)/V¯log2log(meEnt(V¯)/V¯·elog2),$

which implies that

$r≤e2(e−1)log2log(mexp(Ent(V¯)/V¯)·elog2).$

Expressing this in terms of n yields the first inequality (1.2). The second inequality holds since Ent() ≥ 0 implies exp(Ent()/) ≥ 1.

The corresponding statement for the unions follows because a class of sets and the class c of their complements possess the same VC-dimension, and .

In the case of products, note that

and then the rest of the proof proceeds as in the case of intersections.

It follows from concavity of x log x that with ,

$∑j=1mVjlogVj∑j=1mVj=∑1mpjlogVj≤log(∑1mpjVj)≤log(∑1mVj)$

and hence

$1≤meEnt(V¯)/V¯≤m,$
(1.3)

or 0 ≤ Ent()/ ≤ log m, or

$0≤Ent(V¯)≤V¯logm.$

Here are two examples showing that the quantity m/eEnt()/ can be very close to 1 (rather than m) if the Vi’s are quite heterogeneous, even if m is large.

Example 1.1. Suppose that r (large), and that Vi = ri for i = 1,…,m. Then it is not hard to show that

$meEnt(V¯)/V¯→rr−1r1/(r−1)=rr−1exp((r−1)−1logr)$

as m → ∞ where the right side can be made arbitrarily close to 1 by choosing r large.

Example 1.2. Suppose that m = 2 and that V1 = k, V2 = rk for some r . Then

$Ent(V¯)/V¯=log2−1r+1log((r+1)(1+1/r)r)→log2$

as r → ∞ for any fixed k. Therefore

$2eEnt(V¯)/V¯→1$

as r → ∞ for any fixed k.

Our last example shows that the bound of Theorem 1.1 may improve considerably on the bounds resulting from iteration of Dudley’s bound S(1, k) ≤ 2k + 1.

Example 1.3. Suppose V1 = V(1) = k and Vj = V(j) = 1 for j = 2,…,m. Iterative application of Dudley’s bound S(1, k) ≤ 2k + 1 yields , which grows exponentially as m → ∞. On the other hand, Theorem 1.1 yields which is of order c1m log m as m → ∞.

Although we have succeeded here in providing quantitative bounds for , , and , it seems that we are far from being able to provide quantitative bounds for the VC - dimensions of the (much larger) classes involved in [6], [7], and [10].

## Acknowledgement

We owe thanks to a helpful referee for pointing out [4] and for suggesting Example 1.3.

## Footnotes

*Supported in part by NSF Grant DMS-0804587, and by NI-AID grant 2R01 AI291968-04.

## References

1. Assouad P. Densité et dimension. Ann. Inst. Fourier (Grenoble) 1983;33:233–282.
2. Dudley RM. Central limit theorems for empirical measures. Ann. Probab. 1978;6:899–929. (1979)
3. Dudley RM. École d’été de probabilités de Saint-Flour, XII—1982. Lecture Notes in Math. Vol. 1097. Berlin: Springer; 1984. A course on empirical processes; pp. 1–142.
4. Dudley RM. Notes on Empirical Processes. MaPhySto Lecture Notes. Vol. 4 1999.
5. Dudley RM. Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics. Vol. 63. Cambridge: Cambridge University Press; 1999.
6. Laskowski MC. Vapnik-Chervonenkis classes of definable sets. J. London Math. Soc. (2) 1992;45:377–384.
7. Olshen RA, Biden EN, Wyatt MP, Sutherland DH. Gait analysis and the bootstrap. Ann. Statist. 1989;17:1419–1440.
8. Pakes A, Pollard D. Simulation and the asymptotics of optimization estimators. Econometrica. 1989;57:1027–1057.
9. Pollard D. Convergence of Stochastic Processes. Springer Series in Statistics. New York: Springer-Verlag; 1984.
10. Stengle G, Yukich JE. Some new Vapnik-Chervonenkis classes. Ann. Statist. 1989;17:1441–1446.
11. van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. Springer Series in Statistics. New York: Springer-Verlag; 1996. With applications to statistics.

 PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers.