Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Inst Math Stat Collect. Author manuscript; available in PMC 2010 August 20.
Published in final edited form as:
Inst Math Stat Collect. 2009 January 1; 5: 103–107.
doi:  10.1214/09-IMSCOLL508
PMCID: PMC2924594

A note on bounds for VC dimensions


We provide bounds for the VC dimension of class of sets formed by unions, intersections, and products of VC classes of sets C1,…,Cm.

AMS 2000 subject classifications: Primary 60B99, secondary 62G30
Keywords and phrases: Vapnik-Chervonenkis class, combining classes, inequality, entropy

1. Introduction and main results

Let C be a class of subsets of a set X. An arbitrary set of n points {x1,…,xn} has 2n subsets. We say that C picks out a certain subset from {x1,…,xn} if this can be formed as a set of the form C ∩ {x1,…,xn} for some C [set membership] C. The collection C is said to shatter {x1,…,xn} if each of its 2n subsets can be picked out by C. The VC - dimension V(C) is the largest cardinality of a set shattered by C (or +∞ if arbitrarily large finite sets are shattered); more formally, if

Δn(C,x1,,xn)=#{C{x1,,xn}:C[set membership]C},



and V(C) = −1 if C is empty. (The VC-dimension V(C) defined here corresponds to S(C) as defined by [5] page 134. Dudley, and following him ourselves in [11], used the notation V(C) for the VC-index, which is the dimension plus 1. We have switched to using V(C) for the VC-dimension rather than the VC-index, because formulas are simpler in terms of dimension and because the machine learning literature uses dimension rather than index.)

Now suppose that C1, C2,…,Cm are VC-classes of subsets of a given set X with VC dimensions V1,…,Vm. It is known that the classes [square union]j=1mCj, [square intersection]j=1mCj defined by

[square union]j=1mCj[equivalent]{[union or logical sum]j=1mCj:Cj[set membership]Cj,j=1,,m},[square intersection]j=1mCj[equivalent]{j=1mCj:Cj[set membership]Cj,j=1,,m},

are again VC: when C1 = (...) = Cm = C and m = k, this is due to [2] (see also [3], Theorem 9.2.3, page 85, and [5], Theorem 4.2.4, page 141); for general C1, C2 and m = 2 it was shown by [3], Theorem 9.2.6, page 87, (see also [5], Theorem 4.5.3, page 153), and [9], Lemma 15, page 18. See also [8], Lemma 2.5, page 1032. For a summary of these types of VC preservation results, see e.g. [11], page 147. Similarly, if D1,…,Dm are VC-classes of subsets of sets X1,…,Xm, then the class of product sets [multiply sign in box]j=1mDj defined by

[multiply sign in box]j=1mDj[equivalent]{D1××Dm:Dj[set membership]Dj,j=1,,m}

is a VC-class of subsets of X1 × (...) × Xm. This was proved in [1], Proposition 2.5, and in [3], Theorem 9.2.6, page 87 (see also [5], Theorem 4.2.4, page 141).

In the case of m = 2, consider the maximal VC dimensions max V(C1 [square union] C2), max V(C1 [square intersection] C2), and max V(D1 [multiply sign in box] D2), where the maxima are over all classes C1, C2 (or D1, D2 in the last case) with V(C1) = V1, V(C2) = V2 for fixed V1, V2. As shown in [3], Theorem 9.2.7, these are all equal:

maxV(C1[square union]C2)=maxV(C1[square intersection]C2)=maxV(D1[multiply sign in box]D2)[equivalent]S(V1,V2).

[3] provided the following bound for this common value:

Proposition 1.1. S(V1, V2) ≤ T(V1, V2) where, with Cvr[equivalent]j=0v(rj),

T(V1,V2)[equivalent]sup{r[set membership]N:CV1rCV2r2r}.

Because of the somewhat inexplicit nature of the bound in (1.1), this proposition seems not to have been greatly used so far.

Furthermore, [4] (Theorem 4.27, page 63; Proposition 4.38, page 64) showed that S(1, k) ≤ 2k + 1 for all k ≥ 1 with equality for k = 1, 2, 3.

Here we give a further more explicit bound for T(V1, V2) and extend the bounds to the case of general m ≥ 2. Our main result is the following proposition.

Theorem 1.1. Let V[equivalent]j=1mVj. Then the following bounds hold:

{V([square union]j=1mCj)V([square intersection]j=1mCj)V([multiply sign in box]1mDj)}c1Vlog(c2meEnt(V¯)/V¯)c1Vlog(c2m),

where [equivalent] (V1,…,Vm), c1[equivalent]e(e1)log(2)[equals, single dot above]2.28231,c2[equivalent]elog2[equals, single dot above]3.92165,


is the “entropy” of the Vj’s under the discrete uniform distribution with weights 1/m and V¯=m1j=1mVj.

Corollary 1.1. For m = 2 the following bounds hold:

S(V1,V2)T(V1,V2)[left floor]c1(V1+V2)log(2c2exp(Ent(V¯)/V¯))[right floor][equivalent]R(V1,V2)

where c1, c2, Ent(), and V are as in Theorem 1.

Proof. The subsets picked out by [square intersection]iCi from a given set of points {x1,…,xn} in X are the sets C1(...)Cm ∩ {x1,…,xn}. They can be formed by first forming all different sets of the form C1 ∩ {x1,…,xn} for C1 [set membership] C1, next intersecting each of these sets by sets in C2 giving all sets of the form C1C2 ∩ {x1,…,xn}, etc. If Δn(C, y1,…,yn) [equivalent] #{C ∩{y1,…,yn} : C [set membership] C} and Δn(C) = maxy1,…,yn Δn(C, y1,…,yn) for every collection of sets C and points y1,…,yn (as in [11], page 135), then in the first step we obtain at most Δn(C1) different sets, each with n or fewer points. In the second step each of these sets gives rise to at most Δn(C2) different sets, etc. We conclude that

Δn([square intersection]iCi)[product]iΔn(Ci)[product]i(enVi)Vi,

by [11], Corollary 2.6.3, page 136, and the bound (en/s)s for the number of subsets of size smaller than s for ns. By definition the left side of the display is 2n for n equal to the VC-dimension of [square intersection]iCi. We conclude that




With V [equivalent]i Vi, define r = en/V. Then the last display implies that



rlog2elogr+logViVilogViV  =logr+logmEnt(V¯)V¯=log(mreEnt(V¯)/V¯),

and this inequality can in turn be rewritten as


Now note that g(x) [equivalent] x/ log xy for xe implies that x ≤ (e/(e − 1))y log y: g is minimized by x = e and is increasing; furthermore yg(x) for xe implies that

logylogxlog logx=logx(1log logxlogx)logx(11e)

so that


Thus we conclude that


which implies that


Expressing this in terms of n yields the first inequality (1.2). The second inequality holds since Ent() ≥ 0 implies exp(Ent()/V) ≥ 1.

The corresponding statement for the unions follows because a class C of sets and the class Cc of their complements possess the same VC-dimension, and [union or logical sum]iCi=(iCic)c.

In the case of products, note that

Δn([multiply sign in box]1mDj)[product]1mΔn(Dj)[product]j=1m(enVj)Vj,

and then the rest of the proof proceeds as in the case of intersections.

It follows from concavity of x [mapsto] log x that with pj[equivalent]Vj/i=1mVi,


and hence


or 0 ≤ Ent()/V ≤ log m, or


Here are two examples showing that the quantity m/eEnt()/V can be very close to 1 (rather than m) if the Vi’s are quite heterogeneous, even if m is large.

Example 1.1. Suppose that r [set membership] N (large), and that Vi = ri for i = 1,…,m. Then it is not hard to show that


as m → ∞ where the right side can be made arbitrarily close to 1 by choosing r large.

Example 1.2. Suppose that m = 2 and that V1 = k, V2 = rk for some r [set membership] N. Then


as r → ∞ for any fixed k. Therefore


as r → ∞ for any fixed k.

Our last example shows that the bound of Theorem 1.1 may improve considerably on the bounds resulting from iteration of Dudley’s bound S(1, k) ≤ 2k + 1.

Example 1.3. Suppose V1 = V(C1) = k and Vj = V(Cj) = 1 for j = 2,…,m. Iterative application of Dudley’s bound S(1, k) ≤ 2k + 1 yields V([square intersection]j=1mCj)2m1(k+1)1, which grows exponentially as m → ∞. On the other hand, Theorem 1.1 yields V([square intersection]j=1mCj)c1(m+k1)log(c2m) which is of order c1m log m as m → ∞.

Although we have succeeded here in providing quantitative bounds for V([square union]j=1mCj), V([square intersection]j=1mCj), and V([multiply sign in box]1mDj), it seems that we are far from being able to provide quantitative bounds for the VC - dimensions of the (much larger) classes involved in [6], [7], and [10].


We owe thanks to a helpful referee for pointing out [4] and for suggesting Example 1.3.


*Supported in part by NSF Grant DMS-0804587, and by NI-AID grant 2R01 AI291968-04.


1. Assouad P. Densité et dimension. Ann. Inst. Fourier (Grenoble) 1983;33:233–282.
2. Dudley RM. Central limit theorems for empirical measures. Ann. Probab. 1978;6:899–929. (1979)
3. Dudley RM. École d’été de probabilités de Saint-Flour, XII—1982. Lecture Notes in Math. Vol. 1097. Berlin: Springer; 1984. A course on empirical processes; pp. 1–142.
4. Dudley RM. Notes on Empirical Processes. MaPhySto Lecture Notes. Vol. 4 1999.
5. Dudley RM. Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics. Vol. 63. Cambridge: Cambridge University Press; 1999.
6. Laskowski MC. Vapnik-Chervonenkis classes of definable sets. J. London Math. Soc. (2) 1992;45:377–384.
7. Olshen RA, Biden EN, Wyatt MP, Sutherland DH. Gait analysis and the bootstrap. Ann. Statist. 1989;17:1419–1440.
8. Pakes A, Pollard D. Simulation and the asymptotics of optimization estimators. Econometrica. 1989;57:1027–1057.
9. Pollard D. Convergence of Stochastic Processes. Springer Series in Statistics. New York: Springer-Verlag; 1984.
10. Stengle G, Yukich JE. Some new Vapnik-Chervonenkis classes. Ann. Statist. 1989;17:1441–1446.
11. van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. Springer Series in Statistics. New York: Springer-Verlag; 1996. With applications to statistics.