Stat Neerl. Author manuscript; available in PMC 2010 October 1.

Published in final edited form as:

Stat Neerl. 2010 February 1; 64(1): 45–70.

doi: 10.1111/j.1467-9574.2009.00438.x. PMCID: PMC2860328

NIHMSID: NIHMS177740

Fadoua Balabdaoui, CEREMADE, Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775, Paris, CEDEX 16, France;


The classes of monotone or convex (and necessarily monotone) densities on (0, ∞) can be viewed as special cases of the classes of *k-monotone* densities on (0, ∞). These classes bridge the gap between the classes of monotone (1-monotone) and convex decreasing (2-monotone) densities, for which asymptotic results are known, and the class of completely monotone (∞-monotone) densities on (0, ∞). In this paper we consider non-parametric maximum likelihood and least squares estimators of a *k*-monotone density *g*_{0}. We prove existence of the estimators and give characterizations. We also establish consistency properties, and show that the estimators are splines of degree *k* − 1 with simple knots. We further provide asymptotic minimax risk lower bounds for estimating the derivatives ${g}_{0}^{(j)}({x}_{0}),j=0,\dots ,k-1$, at a fixed point *x*_{0} under the assumption that ${(-1)}^{k}{g}_{0}^{(k)}({x}_{0})>0$.

Densities with monotone or convex shape are encountered in many non-parametric estimation problems. Monotone densities arise naturally via connections with renewal theory and uniform mixing; see Vardi (1989) for examples of the former, and Woodroofe and Sun (1993) for the latter, in an astronomical context. Estimation of monotone densities on (0, ∞) was initiated by Grenander (1956a,b), with related work by Ayer *et al.* (1955), Brunk (1958), and Van Eeden (1957a,b). Asymptotic theory of the maximum likelihood estimator (MLE) was developed by Prakasa Rao (1969), with later contributions by Groeneboom (1985, 1989), and Kim and Pollard (1990).

Convex densities arise in connection with Poisson process models for bird migration and scale mixtures of triangular densities; see, for example, Hampel (1987) and Anevski (2003). Estimation of convex densities on (0, ∞) was apparently initiated by Anevski (1994) (see also Anevski, 2003), and was pursued by Jongbloed (1995). The limit distribution theory for the MLE and least squares (LS) estimators and their first derivative at a fixed point was obtained by Groeneboom, Jongbloed, and Wellner (2001). For consistency of the estimators at the origin, see Balabdaoui (2007).

Estimation in the class of *k*-monotone densities on (0, ∞), denoted hereafter by 𝒟_{k}, has recently been considered in Balabdaoui and Wellner (2007) and has several motivating components. By definition, a density *g* on (0, ∞) is *k*-monotone if it is non-negative and non-increasing when *k* = 1, and if ${(-1)}^{j}{g}^{(j)}$ is non-negative, non-increasing and convex for *j* = 0, …, *k* − 2 when *k* ≥ 2.

In Balabdaoui and Wellner (2007), the joint limit distribution theory for the MLE and LSE of a *k*-monotone density and their higher derivatives up to degree *k* − 1 at a fixed point is established modulo a spline conjecture. The rate of convergence of the *j*-th derivative, *j* = 0,…,*k* − 1 is shown to be *n*^{(k−j)/(2k + 1)}. Note that these rates coincide with the minimax lower bounds obtained here. As for the joint limiting distribution, it depends on a Gaussian process *H _{k}* defined uniquely almost surely as follows:

- *H _{k}*(*t*) ≥ *Y _{k}*(*t*) for all *t* ∈ ℝ.
- ${(-1)}^{k}{H}_{k}$ is 2*k*-convex; that is, ${(-1)}^{k}{H}_{k}^{(2k-2)}$ exists and is convex.
- The process *H _{k}* satisfies

$${\int}_{-\mathrm{\infty}}^{\mathrm{\infty}}({H}_{k}(t)-{Y}_{k}(t))\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{H}_{k}^{(2k-1)}(t)=0,$$

where *Y _{k}* is the (*k* − 1)-fold integral of two-sided standard Brownian motion plus (*k*!/(2*k*)!)*t*^{2k}.

Existence of the MLE and LSE of a *k*-monotone density, their characterization, their structure (splines of degree *k* − 1 and with simple knots), and consistency of their derivatives up to degree *k* − 1 are used in Balabdaoui and Wellner (2007). In this paper, we give proofs of those essential properties in sections 2 and 3. In section 4, we establish asymptotic minimax lower bounds for estimation of ${g}_{0}^{(j)}({x}_{0}),j=0,\dots ,k-1$ under the assumption that ${g}_{0}^{(k)}({x}_{0})$ exists and is non-zero.

In the sequel, *X*_{1}, …, *X _{n}* are i.i.d. random variables with density *g*_{0} ∈ 𝒟_{k}, and 𝔾_{n} denotes their empirical distribution function.

Lemma 1, characterizing integrable *k*-monotone functions and giving an inversion formula, follows from the results of Williamson (1956).

*(Integrable k-monotone characterization) A function g is an integrable k-monotone function if and only if it is of the form*

$$g(x)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}}}\mathrm{d}F(t),\phantom{\rule{thinmathspace}{0ex}}x>0$$

(1)

*where F is non-decreasing and bounded on (0, ∞). Thus g is a k-monotone density if and only if it is of the form of Equation 1 for some distribution function F on (0, ∞). If F in Equation 1 satisfies*${\text{lim}}_{t\to \mathrm{\infty}}F(t)={\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\mathrm{d}x}$*, then at a continuity point t > 0, F is given by*

$$F(t)=G(t)-tg(t)+\cdots +\frac{{(-1)}^{k-1}}{(k-1)!}{t}^{k-1}{g}^{(k-2)}(t)+\frac{{(-1)}^{k}}{k!}{t}^{k}{g}^{(k-1)}(t),$$

(2)

*where* $G(t)={\displaystyle {\int}_{0}^{t}g(x)\mathrm{d}x}$.

The representation in Equation 1 follows from theorem 5 of Lévy (1962) by taking *k* = *n* + 1 and *f* ≡ 0 on (−∞, 0]. The inversion formula in Equation 2 follows from lemma 1 of Williamson (1956) together with an integration by parts argument.

For *k* = 1 (*k* = 2), note that the characterization matches the well-known fact that a density is non-increasing (non-increasing and convex) on (0, ∞) if and only if it is a mixture of uniform densities (triangular densities). More generally, the characterization establishes a one-to-one correspondence between the class of *k*-monotone densities and the class of scale mixtures of beta densities with parameters 1 and *k*. From the inversion formula in Equation 2, one can see that a natural estimator for the mixing distribution *F* is obtained by plugging in an estimator for the density *g*, and it becomes clear that the rate of convergence of estimators of *F* will be controlled by the corresponding rate of convergence for estimators of the highest derivative *g*^{(k−1)} of *g*. As *k* increases, the densities become smoother, and therefore the inverse problem of estimating the mixing distribution *F* becomes harder.
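As a quick check on the inversion formula, Equation 2 can be verified symbolically in the simplest case, where the mixing distribution *F* is a point mass at *t*, so that *g* is a single scaled beta(1, *k*) density. The sketch below assumes sympy is available and fixes *k* = 3 arbitrarily; it evaluates the right-hand side of Equation 2 at a continuity point *s* < *t* and confirms that it reduces to *F*(*s*) = 0, as it should to the left of the point mass:

```python
import sympy as sp

x, s, t = sp.symbols("x s t", positive=True)
k = 3  # any fixed k >= 2 works the same way

# k-monotone density when the mixing distribution is a point mass at t;
# we work on the branch 0 < x < s < t, where (t - x)_+ = t - x.
g = k * (t - x) ** (k - 1) / t ** k

G = sp.integrate(g, (x, 0, s))  # G(s) = int_0^s g(x) dx

# Right-hand side of Equation 2: G(s) - s g(s) + ... + (-1)^k s^k g^{(k-1)}(s) / k!
F = G + sum(
    sp.Integer(-1) ** j * s ** j * sp.diff(g, x, j - 1).subs(x, s) / sp.factorial(j)
    for j in range(1, k + 1)
)
print(sp.simplify(F))  # 0, i.e. F(s) = 0 for s < t
```

The same computation at *s* > *t*, where *g* and all its derivatives vanish while *G*(*s*) = 1, returns 1, recovering the distribution function of the point mass.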

We now consider the MLE and LSE of a *k*-monotone density *g*_{0}. We show that these estimators exist and give characterizations thereof. In the following, λ is the Lebesgue measure, 𝓜_{k} is the class of all *k*-monotone functions on (0, ∞), and 𝓜_{k} ∩ *L*_{1}(λ) is the class of all integrable *k*-monotone functions. Note that 𝒟_{k} ⊂ 𝓜_{k} ∩ *L*_{1}(λ).

Let

$${l}_{n}(g)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\text{log}}\phantom{\rule{thinmathspace}{0ex}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)$$

be the log-likelihood function (really *n*^{−1} times the log-likelihood function). We want to maximize *l _{n}*(*g*) over *g* ∈ 𝒟_{k}. It proves convenient to consider instead the adjusted criterion

$${\psi}_{n}(g)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\text{log}}\phantom{\rule{thinmathspace}{0ex}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)-{\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x,}$$

for *g* ∈ 𝓜_{k} ∩ *L*_{1}(λ). In terms of the mixing distribution function *F*, this is equivalent to maximizing

$${\tilde{\psi}}_{n}(F)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\text{log}\phantom{\rule{thinmathspace}{0ex}}\left({\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}F(t)\right)\phantom{\rule{thinmathspace}{0ex}}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}}\mathrm{d}F(t)\mathrm{d}x}}$$

over the space of bounded and non-decreasing functions *F* on (0, ∞).

*The maximizer ĝ _{n} of ψ_{n} over 𝓜_{k} ∩ L_{1}(λ) exists and belongs to 𝒟_{k} (and hence is a density). Furthermore, ĝ_{n} is of the form*

$${\widehat{g}}_{n}(x)={\widehat{w}}_{1}\frac{k{({\widehat{a}}_{1}-x)}_{+}^{k-1}}{{\widehat{a}}_{1}^{k}}+\cdots +{\widehat{w}}_{m}\frac{k{({\widehat{a}}_{m}-x)}_{+}^{k-1}}{{\widehat{a}}_{m}^{k}},$$

*where m ∈ ℕ \ {0}, and ŵ _{1}, …, ŵ_{m} and â_{1}, …, â_{m} are respectively the weights and the support points of the maximizing (discrete) mixing distribution F̂_{n}.*
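The mixture form above is easy to evaluate numerically. In the sketch below (numpy assumed; the weights *ŵ _{j}* and support points *â _{j}* are made-up illustrative values, not the output of an actual fit), the total mass comes out as one, since each scaled beta(1, *k*) component is itself a density and the weights sum to one:

```python
import numpy as np

def mixture_density(x, weights, knots, k):
    """g(x) = sum_j w_j * k * (a_j - x)_+^{k-1} / a_j^k, a spline of degree k - 1."""
    x = np.asarray(x, dtype=float)[..., None]      # broadcast x against the knots
    a = np.asarray(knots, dtype=float)
    comp = k * np.clip(a - x, 0.0, None) ** (k - 1) / a ** k
    return comp @ np.asarray(weights, dtype=float)

# Hypothetical weights/knots for illustration (not from an actual fit):
k, w, a = 3, [0.2, 0.5, 0.3], [1.0, 2.5, 4.0]
xs = np.linspace(0.0, 5.0, 100_001)                # support is contained in [0, max(a)]
vals = mixture_density(xs, w, a, k)
dx = xs[1] - xs[0]
mass = float((vals[0] / 2 + vals[1:-1].sum() + vals[-1] / 2) * dx)  # trapezoid rule
print(round(mass, 6))  # 1.0: weights sum to 1 and each component has unit mass
```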

It follows from lemma 2.2 that the MLE *ĝ _{n}* is a spline of degree *k* − 1 with simple knots *â*_{1}, …, *â _{m}*.

It can be shown that the support points of the mixing distribution *F̂*_{n} fall strictly between the order statistics of the sample.

From Lindsay (1983), we conclude that there exists a unique maximizer of *l _{n}*, and that the maximum is achieved by a finite mixture with at most *n* support points.

By arguing as in Groeneboom *et al.* (2001, p. 1662), let *g* ∈ 𝓜_{k} ∩ *L*_{1}(λ) with *c* ≡ ${\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x>0$, and let *f̂ _{n}* denote the maximizer of *l _{n}* over 𝒟_{k}. Then

$$\begin{array}{cc}{\psi}_{n}({\widehat{f}}_{n})-{\psi}_{n}(g)\hfill & ={\displaystyle {\int}_{0}^{\mathrm{\infty}}\text{log}\phantom{\rule{thinmathspace}{0ex}}{\widehat{f}}_{n}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)-1-{\displaystyle {\int}_{0}^{\mathrm{\infty}}\text{log}\phantom{\rule{thinmathspace}{0ex}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)+c}}\hfill \\ \hfill & ={l}_{n}({\widehat{f}}_{n})-{l}_{n}(g/c)-\text{log}\phantom{\rule{thinmathspace}{0ex}}c+c-1\ge -\text{log}\phantom{\rule{thinmathspace}{0ex}}c+c-1\ge 0.\hfill \end{array}$$

Hence, *ĝ _{n}* = *f̂ _{n}*; that is, maximizing ψ_{n} over 𝓜_{k} ∩ *L*_{1}(λ) yields the MLE over 𝒟_{k}.

Considering maximization over the bigger set 𝓜_{k} ∩ *L*_{1}(λ) has the advantage of removing the unit-mass constraint from the optimization problem.

Lemma 3 gives a necessary and sufficient condition for a function *ĝ _{n}* ∈ 𝓜_{k} ∩ *L*_{1}(λ) to be the MLE.

For *k* ≥ 3 it generalizes lemma 2.4 of Groeneboom *et al.* (2001).

*Let X _{1}, …, X_{n} be i.i.d. random variables from the true density g_{0}. A k-monotone spline ĝ_{n} of degree k − 1 with simple knots â_{1}, …, â_{m} is the MLE if and only if for all t > 0*

$${\widehat{H}}_{n}(t)\equiv {\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)\le 1}$$

(3)

$${\widehat{H}}_{n}(t)=1,\phantom{\rule{thinmathspace}{0ex}}\text{if}\phantom{\rule{thinmathspace}{0ex}}t\in \{{\widehat{a}}_{1},\dots ,{\widehat{a}}_{m}\}.$$

(4)

See the Appendix.

Note that *t* is a knot in {*â*_{1}, …, *â _{m}*} if and only if ${(-1)}^{k-1}{\widehat{g}}_{n}^{(k-1)}(t-)<{(-1)}^{k-1}{\widehat{g}}_{n}^{(k-1)}(t+)$. Thus, the equality condition in Equation 4 can be re-expressed in terms of the left and right (*k* − 1)-th derivatives of *ĝ _{n}*.

The MLE *ĝ _{n}* can be computed by means of the support reduction algorithm of Groeneboom, Jongbloed, and Wellner (2008); also see Balabdaoui and Wellner (2004) for further details.
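The characterization in lemma 3 is a finite check once a candidate is fixed: evaluate Ĥ_{n}(*t*) from Equation 3 and test whether it stays at or below 1, with equality at the knots. A sketch assuming numpy; the candidate weights and knots below are hypothetical (not an actual MLE), so for them the condition will generally fail somewhere:

```python
import numpy as np

def H_n(t, data, weights, knots, k):
    """Left side of Equation 3: H_n(t) = (1/n) sum_i k (t - X_i)_+^{k-1} / (t^k g_n(X_i))."""
    data = np.asarray(data, dtype=float)
    a, w = np.asarray(knots, dtype=float), np.asarray(weights, dtype=float)
    g_hat = (k * np.clip(a - data[:, None], 0.0, None) ** (k - 1) / a ** k) @ w
    numer = k * np.clip(t - data, 0.0, None) ** (k - 1) / t ** k
    return float(np.mean(numer / g_hat))

# Toy data kept inside (0, max knot) so the candidate density is positive at each point;
# the candidate (w, a) is hypothetical, not the result of an actual fit.
rng = np.random.default_rng(0)
data = rng.uniform(0.2, 2.8, size=50)
k, w, a = 3, [0.5, 0.5], [1.0, 3.0]
# At the MLE, H_n(t) <= 1 for all t > 0, with equality exactly at the knots;
# an arbitrary candidate such as this one will generally violate the bound somewhere.
print([round(H_n(t, data, w, a, k), 3) for t in (0.5, 1.0, 3.0)])
```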

Now, we briefly consider the LSE. The LS criterion is:

$${Q}_{n}(g)=\frac{1}{2}{\displaystyle {\int}_{0}^{\mathrm{\infty}}{g}^{2}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x-{\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x).}}$$

(5)

We want to minimize this over *g* ∈ 𝓜_{k} ∩ *L*_{2}(λ); the minimizer *g̃*_{n} is the LSE.

In this section, we will prove that both the MLE and LSE are strongly consistent. Furthermore, we will show that this consistency is uniform on intervals of the form [*c*, ∞), where *c* > 0.

Consistency of the MLE for the classes 𝒟_{k}, in the sense of Hellinger convergence of the mixed density, is a straightforward consequence of the methods of Pfanzagl (1988) and van de Geer (1993). As usual, the Hellinger distance *H* is given by ${H}^{2}({g}_{1},{g}_{2})=\frac{1}{2}{\int}_{0}^{\mathrm{\infty}}{(\sqrt{{g}_{1}(x)}-\sqrt{{g}_{2}(x)})}^{2}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x$.

Suppose that *ĝ _{n}* is the MLE of *g*_{0} ∈ 𝒟_{k}. Then

$$H({\widehat{g}}_{n},{g}_{0}){\to}_{a.s.}0\phantom{\rule{thinmathspace}{0ex}}\text{as}\phantom{\rule{thinmathspace}{0ex}}n\to \mathrm{\infty}.$$

Furthermore, *F̂*_{n} converges vaguely, almost surely, to *F*_{0}, the mixing distribution function of *g*_{0}.

Note that 𝒟_{k} = {*g _{F}* : *F* is a distribution function on (0, ∞)}, where

$${g}_{F}(x)=g(x)={\displaystyle {\int}_{0}^{\mathrm{\infty}}{k}_{x}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}F(t)}$$

and *k _{x}*(*t*) = *k*(*t* − *x*)_{+}^{k−1}/*t*^{k} is the scaled beta(1, *k*) kernel. The class

$$\mathcal{G}=\left\{\frac{{g}_{F}}{{g}_{F}+{g}_{{F}_{0}}}:F\phantom{\rule{thinmathspace}{0ex}}\text{is a d.f. on}\phantom{\rule{thinmathspace}{0ex}}(0,\mathrm{\infty})\right\}$$

is continuous in *F* with respect to the vague topology for every *x* > 0. Now, as the family of sub-distribution functions *F* on (0, ∞) is compact for the vague topology (see, e.g., Bauer, 1981), and the class 𝒢 is uniformly bounded by 1, we conclude by lemma 5.1 of van de Geer (1993) that 𝒢 is *P*_{0}-Glivenko–Cantelli. It follows by corollary 1 of van der Vaart and Wellner (2000) that *H*(*ĝ _{n}*, *g*_{0}) →_{a.s.} 0.

Lemma 4 establishes a useful bound for *k*-monotone densities.

*If g is a k-monotone density function for k ≥ 2, then*

$$g(x)\le \frac{1}{x}{\left(1-\frac{1}{k}\right)}^{k-1}$$

*for all* *x* > 0.

We have

$$\begin{array}{cc}g(x)& ={\displaystyle {\int}_{x}^{\mathrm{\infty}}\frac{k}{{y}^{k}}{(y-x)}^{k-1}\mathrm{d}F(y)=\frac{1}{x}{\displaystyle {\int}_{x}^{\mathrm{\infty}}\frac{kx}{y}}{\phantom{\rule{thinmathspace}{0ex}}\left(1-\frac{x}{y}\right)}^{k-1}\mathrm{d}F(y)}\hfill \\ & \le \frac{1}{x}{\text{sup}}_{x\le y<\mathrm{\infty}}\frac{kx}{y}{\left(1-\frac{x}{y}\right)}^{k-1}=\frac{k}{x}{\text{sup}}_{0<u\le 1}u{(1-u)}^{k-1}=\frac{1}{x}{\left(1-\frac{1}{k}\right)}^{k-1}\hfill \end{array}$$

by an easy calculation. [Note that when *k* = 2, this bound equals 1/(2*x*) which agrees with the bound given by Jongbloed (1995, p. 117) and Groeneboom *et al.*, (2001, p. 1669) in this case.]
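Lemma 4 can be spot-checked numerically: for any discrete mixing distribution, *x*·*g*(*x*) should never exceed (1 − 1/*k*)^{k−1}, with equality approached when *F* is a point mass at *a* and *x* = *a*/*k*. A sketch, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(1)
for k in (2, 3, 5, 8):
    # random discrete mixing distribution F on (0, infinity)
    a = rng.uniform(0.5, 10.0, size=6)                  # support points
    w = rng.dirichlet(np.ones(6))                       # mixing weights
    xs = np.linspace(1e-3, 12.0, 20_000)
    g = (k * np.clip(a - xs[:, None], 0.0, None) ** (k - 1) / a ** k) @ w
    assert np.max(xs * g) <= (1 - 1 / k) ** (k - 1) + 1e-12
print("x * g(x) <= (1 - 1/k)^(k-1) on all test mixtures")
```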

*Let c > 0. Then for j = 0, 1, …, k − 2*

$$\underset{x\in \left[c,\mathrm{\infty}\right)}{\text{sup}}|{\widehat{g}}_{n}^{(j)}(x)-{g}_{0}^{(j)}(x)|{\to}_{a.s.}0,\phantom{\rule{thinmathspace}{0ex}}\text{as}\phantom{\rule{thinmathspace}{0ex}}n\to \mathrm{\infty},$$

*and for each x* > 0 *at which g*_{0} *is (k − 1)-times differentiable,* ${\widehat{g}}_{n}^{(k-1)}(x){\to}_{a.s.}{g}_{0}^{(k-1)}(x)$.

Using the first part in the characterization of the MLE, we have

$${\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}(x)}{{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)\le 1.$$

(6)

Let *F̂*_{n} denote again the MLE of the mixing distribution. By the Helly–Bray theorem, there exists a subsequence $\{{\widehat{F}}_{{n}_{l}}\}$ that converges vaguely, almost surely, to some sub-distribution function *F̂*. Define

$$\widehat{g}(x)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\widehat{F}(t),\phantom{\rule{thinmathspace}{0ex}}x>0.$$

The previous convergence is uniform on [*c*, ∞), *c* > 0. This follows because ${\widehat{g}}_{{n}_{l}}$ and *ĝ* are monotone and *ĝ* is continuous, so the pointwise convergence is automatically uniform on [*c*, ∞).

Using the inequality 6 we can show that the limit *ĝ* and *g*_{0} have to be the same, which implies the consistency result. The proof follows along the lines of Groeneboom *et al.* (2001, p. 1674–1675; see the Appendix). Consistency of the higher derivatives can be shown recursively using convexity of ${(-1)}^{j}{\widehat{g}}_{n}^{(j)}$ for *j* = 1, …, *k* − 1 in the same way as in the proof of lemma 3.1 of Groeneboom *et al.* (2001): for small *h* > 0, convexity of ${(-1)}^{j}{\widehat{g}}_{n}^{(j)}$ allows us to write, for *j* = 0, …, *k* −2,

$$\begin{array}{c}\frac{{(-1)}^{j}{\widehat{g}}_{n}^{(j)}(x-h)-{(-1)}^{j}{\widehat{g}}_{n}^{(j)}(x)}{-h}\hfill \\ \le {(-1)}^{j}{\widehat{g}}_{n}^{(j+1)}(x-)\le {(-1)}^{j}{\widehat{g}}_{n}^{(j+1)}(x+)\hfill \\ \le \frac{{(-1)}^{j}{\widehat{g}}_{n}^{(j)}(x+h)-{(-1)}^{j}{\widehat{g}}_{n}^{(j)}(x)}{h}.\hfill \end{array}$$

By letting *n* → ∞, this implies that

$$\begin{array}{c}\frac{{(-1)}^{j}{g}_{0}^{(j)}(x-h)-{(-1)}^{j}{g}_{0}^{(j)}(x)}{-h}\hfill \\ \le \underset{n\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{(-1)}^{j}{\widehat{g}}_{n}^{(j+1)}(x-)\le \underset{n\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{sup}}{(-1)}^{j}{\widehat{g}}_{n}^{(j+1)}(x+)\hfill \\ \le \frac{{(-1)}^{j}{g}_{0}^{(j)}(x+h)-{(-1)}^{j}{g}_{0}^{(j)}(x)}{h}.\hfill \end{array}$$

By letting *h* ↓ 0, we conclude consistency of ${\widehat{g}}_{n}^{(j)}(x),j=0,\dots ,k-1$, for *x* ∈ (0, ∞). Note that consistency of ${(-1)}^{j}{\widehat{g}}_{n}^{(j)},j=1,\dots ,k-2$, is uniform on intervals of the form [*c*, ∞) because of continuity of those derivatives. For *j* = *k* − 1, only pointwise strong consistency of ${(-1)}^{k-1}{\widehat{g}}_{n}^{(k-1)}$ can be claimed.

We also have strong and uniform consistency of the LSE *g̃*_{n} on intervals of the form [*c*, ∞), *c* > 0; the arguments parallel those for the MLE.

In this section, our goal is to derive minimax lower bounds for the behavior of *any estimator* of a *k*-monotone density *g* and its first *k* − 1 derivatives at a point *x*_{0} for which the *k*-th derivative exists and is non-zero. The proof will rely on the basic lemma 4.1 of Groeneboom (1996); see also Jongbloed (2000). This basic method seems to go back to Donoho and Liu (1987, 1991).

As before, let 𝒟_{k} denote the class of *k*-monotone densities on (0, ∞). For a real-valued functional *T* of the density, the local minimax risk over a subclass 𝒟_{k,n} ⊂ 𝒟_{k} is defined by

$${\text{MMR}}_{1}(n,T,{\mathcal{D}}_{k,n})={\text{inf}}_{{t}_{n}}{\text{sup}}_{g\in {\mathcal{D}}_{k,n}}{E}_{g}|{\widehat{T}}_{n}-{T}_{g}|.$$

Here the infimum ranges over all possible measurable functions *t _{n}* : ℝ^{n} → ℝ, with *T̂*_{n} = *t*_{n}(*X*_{1}, …, *X*_{n}), and the subclass 𝒟_{k,n} shrinks toward *g*_{0} as *n* → ∞:

$${\mathcal{D}}_{k,n}\equiv {\mathcal{D}}_{k,n,\tau}=\left\{g\in {\mathcal{D}}_{k}:{H}^{2}(g,{g}_{0})=\frac{1}{2}{\displaystyle {\int}_{0}^{\mathrm{\infty}}{(\sqrt{g(x)}-\sqrt{{g}_{0}(x)})}^{2}\mathrm{d}x\le \tau /n}\right\}.$$

The behavior, as *n* → ∞, of such a local minimax risk MMR_{1} will depend on *n* (rate of convergence to zero) and on the density *g*_{0} toward which the subclasses shrink. Lemma 5 is the basic tool for proving such a lower bound.

*Assume that there exists some subset* {*g*_{ε} : ε > 0} *of densities in 𝒟_{k,n} such that, as* ε ↓ 0,

$${H}^{2}({g}_{\epsilon},{g}_{0})\le \epsilon (1+o(1))\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}|T{g}_{\epsilon}-T{g}_{0}|\ge {(c\epsilon )}^{r}(1+o(1))$$

*for some c > 0 and r > 0. Then*

$$\underset{\tau >0}{\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{n}^{r}{\text{MMR}}_{1}(n,T,{\mathcal{D}}_{k,n,\tau})\ge \frac{1}{4}{\left(\frac{\mathit{\text{cr}}}{2e}\right)}^{r}.$$

*Let g _{0} ∈ 𝒟_{k} and let x_{0} be a fixed point in* (0, ∞) *such that g_{0} is k-times continuously differentiable at x_{0} with* ${(-1)}^{k}{g}_{0}^{(k)}({x}_{0})>0$. *Then, for estimating* ${T}_{j}{g}_{0}={g}_{0}^{(j)}({x}_{0})$,

$$\underset{\tau >0}{\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{n}^{\frac{k-j}{2k+1}}{\text{MMR}}_{1}(n,{T}_{j},{\mathcal{D}}_{k,n,\tau})\ge {\left\{{|{g}_{0}^{(k)}({x}_{0})|}^{2j+1}{g}_{0}{({x}_{0})}^{k-j}\right\}}^{1/(2k+1)}{d}_{k,j},$$

*where d _{k,j} > 0 for j ∈ {0, …, k − 1}. Here*

$${d}_{k,j}=\frac{1}{4}{\left(4\frac{k-j}{2k+1}{e}^{-1}\right)}^{(k-j)/(2k+1)}\frac{{\lambda}_{k,1}^{(j)}}{{({\lambda}_{k,2})}^{(k-j)/(2k+1)}},$$

*where*

$${\lambda}_{k,2}=\left\{\begin{array}{cc}{2}^{4(k+1)}\frac{(2k+3)(k+2)}{{(k+1)}^{2}}\frac{{((2(k+1))!)}^{2}}{(4k+7)!\phantom{\rule{thinmathspace}{0ex}}{((k-1)!)}^{2}{\left(\begin{array}{c}k\\ k/2-1\end{array}\right)}^{2}},\hfill & k\phantom{\rule{thinmathspace}{0ex}}\text{even},\hfill \\ {2}^{4(k+2)}(2k+3)(k+2)\frac{{((2(k+1))!)}^{2}}{(4k+7)!\phantom{\rule{thinmathspace}{0ex}}{(k!)}^{2}{\left(\begin{array}{c}k+1\\ (k-1)/2\end{array}\right)}^{2}},\hfill & k\phantom{\rule{thinmathspace}{0ex}}\text{odd}.\hfill \end{array}\right.$$

Proposition 3 also yields lower bounds for estimation of the corresponding mixing distribution function *F* at a fixed point.

*Let g _{0} ∈ 𝒟_{k} and let x_{0} be a fixed point in (0, ∞) such that g_{0} is k-times continuously differentiable at x_{0}, k ≥ 2. Then, for estimating T g_{0} = F_{0}(x_{0}), where F_{0} is given in terms of g_{0} by Equation 2,*

$$\begin{array}{c}\underset{\tau >0}{\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \mathrm{\infty}}{\text{lim inf}}\phantom{\rule{thinmathspace}{0ex}}{n}^{1/(2k+1)}{\text{MMR}}_{1}(n,T,{\mathcal{D}}_{k,n,\tau})\hfill \\ \ge {\left\{{|{g}_{0}^{(k)}({x}_{0})|}^{2k-1}{g}_{0}({x}_{0})\right\}}^{1/(2k+1)}\frac{{x}_{0}^{k}}{k!}{d}_{k,k-1}.\hfill \end{array}$$

See the Appendix.

Both the rates of convergence *n*^{(k−j)/(2k+1)} and the dependence of our lower bounds on the constants *g*_{0}(*x*_{0}) and ${g}_{0}^{(k)}({x}_{0})$ match the known results for *k* = 1 and *k* = 2 due to Groeneboom (1985) and Groeneboom *et al.* (2001), and reappear in the limit distribution theory for *k* ≥ 3 in Balabdaoui and Wellner (2007).

The research of the second author was supported in part by NSF grants DMS-0203320 and DMS-0503822, and by NIAID grant 2R01 AI291968-04. The authors gratefully acknowledge helpful conversations with Carl de Boor, Nira Dyn, Tilmann Gneiting and Piet Groeneboom.

The arguments generalize those in the proof of lemma 2.4 of Groeneboom *et al.* (2001). If *ĝ _{n}* is the MLE, let ${g}_{t}(x)=k{(t-x)}_{+}^{k-1}/{t}^{k}$ for *t* > 0. Since *ĝ _{n}* maximizes ψ_{n}, adding a small multiple of *g _{t}* cannot increase the criterion; that is, for every *t* > 0,

$$\underset{\epsilon \searrow 0}{\text{lim}}\frac{1}{\epsilon}({\psi}_{n}({\widehat{g}}_{n}+\epsilon {g}_{t})-{\psi}_{n}({\widehat{g}}_{n}))\le 0\iff {\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}/{t}^{k}}{{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)-1\le 0}$$

yielding the inequality in Equation 3.
If *t* ∈ {*â*_{1}, …, *â _{m}*}, then for ε with |ε| small enough (of either sign), *ĝ _{n}* + ε*g _{t}* remains in 𝓜_{k} ∩ *L*_{1}(λ), and hence

$$\underset{\epsilon \to 0}{\text{lim}}\frac{1}{\epsilon}({\psi}_{n}({\widehat{g}}_{n}+\epsilon {g}_{t})-{\psi}_{n}({\widehat{g}}_{n}))=0\iff {\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}/{t}^{k}}{{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)-1=0}$$

yielding the identity in Equation 4.

Suppose now that *ĝ _{n}* is a *k*-monotone spline of degree *k* − 1 with simple knots satisfying Equations 3 and 4, and let *g* ∈ 𝓜_{k} ∩ *L*_{1}(λ) be arbitrary, with mixing representation

$$g(x)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}}\mathrm{d}F(t).}$$

We can write

$$\begin{array}{cc}{\psi}_{n}({\widehat{g}}_{n})-{\psi}_{n}(g)\hfill & ={\displaystyle {\int}_{0}^{\mathrm{\infty}}\mathrm{log}\phantom{\rule{thinmathspace}{0ex}}\left(\frac{{\widehat{g}}_{n}(x)}{g(x)}\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)-1+{\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x}}\hfill \\ \hfill & \ge {\displaystyle {\int}_{0}^{\mathrm{\infty}}\left(1-\frac{g(x)}{{\widehat{g}}_{n}(x)}\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)-1+{\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x}}\hfill \end{array}$$

using the inequality log *z* ≥ 1 − 1/*z*, *z* > 0

$$\begin{array}{c}=-{\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{g(x)}{{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)+{\displaystyle {\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x}}\hfill \\ =-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}{\widehat{g}}_{n}(x)}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}F(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)+{\displaystyle {\int}_{0}^{\mathrm{\infty}}\mathrm{d}F(t)}}\hfill \\ ={\displaystyle {\int}_{0}^{\mathrm{\infty}}\left(-{\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{k{(t-x)}_{+}^{k-1}}{{t}^{k}{\widehat{g}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x)+1}\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}F(t)\ge 0\phantom{\rule{thinmathspace}{0ex}}\text{by Equation}\phantom{\rule{thinmathspace}{0ex}}3.}\hfill \end{array}$$

Hence, *ĝ _{n}* is the MLE.

In this optimization problem, existence requires more work because there is no available theory as in the case of the MLE. However, we will show that even though the resulting estimator does not necessarily have total mass one, its total mass converges almost surely to one and it consistently estimates *g*_{0} ∈ 𝒟_{k}.

Using arguments similar to those in the proof of theorem 1 in Williamson (1956), one can show that *g* ∈ 𝓜_{k} if and only if

$$g(x)={\displaystyle {\int}_{0}^{\mathrm{\infty}}{(t-x)}_{+}^{k-1}\mathrm{d}\mu (t)}$$

for a positive measure *μ* on (0, ∞). Thus, we can rewrite the criterion *Q _{n}* in terms of the corresponding measure *μ*. We have

$${\int}_{0}^{\mathrm{\infty}}{g}^{2}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x={\int}_{0}^{\mathrm{\infty}}{\int}_{0}^{\mathrm{\infty}}{r}_{k}(t,{t}^{\prime})\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu ({t}^{\prime})$$

where ${r}_{k}(t,{t}^{\prime})={\int}_{0}^{t\wedge {t}^{\prime}}{(t-x)}^{k-1}{({t}^{\prime}-x)}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x$, and

$${\int}_{0}^{\mathrm{\infty}}g(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)={\int}_{0}^{\mathrm{\infty}}{\int}_{0}^{\mathrm{\infty}}{(t-x)}_{+}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)={\int}_{0}^{\mathrm{\infty}}{s}_{n,k}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t),$$

where ${s}_{n,k}(t)\equiv {\int}_{0}^{\mathrm{\infty}}{(t-x)}_{+}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{n}(x)$. Hence it follows that, with *g* = *g*_{μ},

$${Q}_{n}(g)=\frac{1}{2}{\displaystyle {\int}_{0}^{\mathrm{\infty}}{\displaystyle {\int}_{0}^{\mathrm{\infty}}{r}_{k}(t,t\prime )\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t\prime )}-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{s}_{n,k}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\equiv {\mathrm{\Phi}}_{n}(\mu ).}}$$
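For a discrete measure μ = Σ_{j} c_{j} δ_{t_{j}}, Φ_{n} reduces to a finite quadratic form in the masses c_{j}, which is what makes the least squares problem computationally tractable. The sketch below (numpy assumed; it only evaluates Φ_{n}, it does not minimize it) computes r_{k} by Gauss–Legendre quadrature, which is exact here because the integrand is a polynomial of degree 2k − 2:

```python
import numpy as np

def r_k(t, tp, k):
    """r_k(t, t') = int_0^{t ^ t'} (t - x)^{k-1} (t' - x)^{k-1} dx.
    Gauss-Legendre with k nodes is exact for polynomials of degree <= 2k - 1."""
    u = min(t, tp)
    nodes, wts = np.polynomial.legendre.leggauss(k)
    x = 0.5 * u * (nodes + 1.0)               # map [-1, 1] onto [0, u]
    return 0.5 * u * float(np.sum(wts * (t - x) ** (k - 1) * (tp - x) ** (k - 1)))

def s_nk(t, data, k):
    """s_{n,k}(t) = int (t - x)_+^{k-1} dG_n(x) = (1/n) sum_i (t - X_i)_+^{k-1}."""
    return float(np.mean(np.clip(t - np.asarray(data, float), 0.0, None) ** (k - 1)))

def Phi_n(masses, supports, data, k):
    """Phi_n(mu) = (1/2) sum_{j,l} c_j c_l r_k(t_j, t_l) - sum_j c_j s_{n,k}(t_j)."""
    quad = sum(ci * cj * r_k(ti, tj, k)
               for ci, ti in zip(masses, supports)
               for cj, tj in zip(masses, supports))
    return 0.5 * quad - sum(ci * s_nk(ti, data, k) for ci, ti in zip(masses, supports))

# Illustrative evaluation on toy data (masses and supports are hypothetical):
data = [0.3, 0.7, 1.2, 2.1]
print(Phi_n([0.4, 0.6], [1.5, 3.0], data, k=3))
```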

Now, we want to minimize Φ_{n} over the set χ of all non-negative measures on [0, ∞).

The functional Φ_{n} admits a unique minimizer *μ̃*_{n}, and hence the LSE *g̃*_{n} exists and is unique.

Uniqueness follows from strict convexity of Φ_{n}. To prove existence, it can be shown that Φ_{n} attains its minimum on a suitable vaguely compact subset of χ on which it is lower semicontinuous.

We begin by checking the hypotheses of Zeidler’s theorem 38.B (Zeidler, 1985, p. 152). We identify *X* of Zeidler’s theorem with the space χ of non-negative measures on [0, ∞), and we show that we can take *M* of Zeidler’s theorem to be

$$\mathcal{C}\equiv \left\{\mu \in \chi :\mu (t,\mathrm{\infty})\le D{t}^{-(k-1/2)}\right\}$$

for some constant *D* < ∞.

First, we can, without loss of generality, restrict the minimization to the space of non-negative measures on [*X*_{(1)}, ∞), where *X*_{(1)} > 0 is the first order statistic of the data. To see this, note that we can decompose any measure *μ* as *μ* = *μ*_{1} + *μ*_{2}, where *μ*_{1} is concentrated on [0, *X*_{(1)}) and *μ*_{2} is concentrated on [*X*_{(1)}, ∞). As the second term of Φ_{n} vanishes for *μ*_{1} (since *s*_{n,k}(*t*) = 0 for *t* ≤ *X*_{(1)}), while the first term can only increase the criterion, replacing *μ* by *μ*_{2} can only decrease Φ_{n}.

We can restrict further to measures *μ* with ${\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\le D$ for some finite *D* = *D*_{ω}. To show this, we first give a lower bound for *r _{k}*(*s*, *t*) for *s*, *t* ≥ *t*_{0} > 0:

$${r}_{k}(s,t)\ge \frac{(1-{e}^{-{v}_{0}}){t}_{0}}{2k}{s}^{k-1}{t}^{k-1},$$

(A.1)

where *v*_{0} ≈ 1.59. To prove Equation A.1, we will use the inequality

$${(1-v/k)}^{k-1}\ge {e}^{-v},\text{\hspace{1em}\hspace{1em}}0\le v\le {v}_{0},\phantom{\rule{thinmathspace}{0ex}}k\ge 2.$$

(A.2)

[This inequality holds by straightforward computation; see Hall and Wellner (1979), especially their proposition 2.]

Thus, we compute

$$\begin{array}{cc}{r}_{k}(s,t)\hfill & ={\displaystyle {\int}_{0}^{\mathrm{\infty}}{(s-x)}_{+}^{k-1}{(t-x)}_{+}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x}\hfill \\ \hfill & =\frac{1}{k}{s}^{k-1}{t}^{k-1}{\displaystyle {\int}_{0}^{\mathrm{\infty}}{\left(1-\frac{y}{sk}\right)}_{+}^{k-1}{\left(1-\frac{y}{tk}\right)}_{+}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}y}\hfill \\ \hfill & \ge \frac{1}{k}{s}^{k-1}{t}^{k-1}{\displaystyle {\int}_{0}^{{v}_{0}(t\wedge s)}{e}^{-y/s}{e}^{-y/t}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}y}\hfill \\ \hfill & =\frac{1}{k}{s}^{k-1}{t}^{k-1}\frac{1}{c}{\displaystyle {\int}_{0}^{{v}_{0}(t\wedge s)}c{e}^{-cy}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}y},\phantom{\rule{thickmathspace}{0ex}}c\equiv 1/s+1/t\hfill \\ \hfill & =\frac{1}{k}{s}^{k-1}{t}^{k-1}\frac{1}{c}\left(1-\text{exp}(-c(t\wedge s){v}_{0})\right)\hfill \\ \hfill & \ge \frac{1}{k}{s}^{k-1}{t}^{k-1}\frac{1}{c}\left(1-\text{exp}(-{v}_{0})\right)\hfill \end{array}$$

as

$$c(s\wedge t)=\frac{s+t}{st}(s\wedge t)=\left\{\begin{array}{cc}(t+s)/t,\hfill & s\le t\hfill \\ (t+s)/s,\hfill & s\ge t\hfill \end{array}\right\}\ge 1.$$

But, we also have

$$\frac{1}{c}=\frac{1}{(1/s)+(1/t)}=\frac{st}{s+t}\ge \frac{1}{2}(s\wedge t)\ge \frac{1}{2}{t}_{0}$$

for *s*, *t* ≥ *t*_{0}, so we conclude that Equation A.1 holds.
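Both inequalities can be spot-checked numerically: A.2 directly on the stated range (it is nearly tight at k = 2, v = v_0, which is why v_0 enters the constant), and A.1 by computing r_k(s, t) with a simple quadrature and comparing against the claimed lower bound for s, t ≥ t_0. A sketch, assuming numpy:

```python
import numpy as np

V0 = 1.59  # constant from inequality (A.2)

# Inequality (A.2): (1 - v/k)^(k-1) >= e^{-v} for 0 <= v <= v0 and k >= 2
for k in range(2, 51):
    v = np.linspace(0.0, V0, 200)
    assert np.all((1.0 - v / k) ** (k - 1) >= np.exp(-v))

def r_k(s, t, k, n_grid=100_000):
    """r_k(s, t) = int_0^{s ^ t} (s - x)^{k-1} (t - x)^{k-1} dx (trapezoid rule)."""
    x = np.linspace(0.0, min(s, t), n_grid)
    y = (s - x) ** (k - 1) * (t - x) ** (k - 1)
    dx = x[1] - x[0]
    return float((y[0] / 2 + y[1:-1].sum() + y[-1] / 2) * dx)

# Inequality (A.1): r_k(s, t) >= (1 - e^{-v0}) t0 / (2k) * s^{k-1} t^{k-1}, s, t >= t0
t0 = 0.5
for k in (2, 3, 5):
    for s in (0.5, 1.0, 2.0):
        for t in (0.5, 1.5, 4.0):
            lower = (1.0 - np.exp(-V0)) * t0 / (2 * k) * s ** (k - 1) * t ** (k - 1)
            assert r_k(s, t, k) >= lower
print("inequalities (A.1) and (A.2) hold at all checked points")
```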

From the inequality A.1, we conclude that for measures *μ* concentrated on [*X*_{(1)}, ∞) we have

$$\iint {r}_{k}(s,t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (s)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\ge \frac{(1-{e}^{-{v}_{0}}){X}_{(1)}}{2k}{\left({\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\right)}^{2}.$$

In contrast,

$${\int}_{0}^{\mathrm{\infty}}{s}_{n,k}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\le {\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t).$$

Combining these two inequalities it follows that for any measure *μ* concentrated on [*X*_{(1)}, ∞) we have

$$\begin{array}{cc}{\mathrm{\Phi}}_{n}(\mu )\hfill & =\frac{1}{2}{\displaystyle \iint {r}_{k}(t,s)}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (s)-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{s}_{n,k}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)}\hfill \\ \hfill & \ge \frac{(1-{e}^{-{v}_{0}}){X}_{(1)}}{4k}{\left({\displaystyle {\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)}\right)}^{2}-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)}\hfill \\ \hfill & \equiv A{m}_{k-1}^{2}-{m}_{k-1}.\hfill \end{array}$$

This lower bound is strictly positive if

$${m}_{k-1}>1/A=\frac{4k}{(1-{e}^{-{v}_{0}}){X}_{(1)}}.$$

But for such measures *μ*, we can make Φ_{n} smaller by taking the zero measure (for which Φ_{n} = 0). Thus, we may restrict the minimization problem to the collection of measures *μ* satisfying

$${m}_{k-1}\le 1/A.$$

(A.3)

Now we decompose any measure *μ* on [*X*_{(1)}, ∞) as *μ* = *μ*_{1} + *μ*_{2} where *μ*_{1} is concentrated on [*X*_{(1)}, *MX*_{(n)}] and *μ*_{2} is concentrated on (*MX*_{(n)}, ∞) for some (large) *M* > 0. Then, it follows that

$$\begin{array}{cc}{\mathrm{\Phi}}_{n}(\mu )\hfill & \ge \frac{1}{2}{\displaystyle \iint {r}_{k}(t,s)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mu}_{2}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mu}_{2}(s)-{\displaystyle {\int}_{0}^{\mathrm{\infty}}{t}^{k-1}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)}}\hfill \\ \hfill & \ge \frac{(1-{e}^{-{v}_{0}}){\mathit{\text{MX}}}_{(n)}}{4k}{({\mathit{\text{MX}}}_{(n)})}^{2k-2}\mu {({\mathit{\text{MX}}}_{(n)},\mathrm{\infty})}^{2}-1/A\hfill \\ \hfill & \equiv B\mu {({\mathit{\text{MX}}}_{(n)},\mathrm{\infty})}^{2}-1/A>0\hfill \end{array}$$

if

$$\mu {({\mathit{\text{MX}}}_{(n)},\mathrm{\infty})}^{2}>\frac{1}{\mathit{\text{AB}}}=\frac{4k}{(1-{e}^{-{v}_{0}}){X}_{(1)}}\frac{4k}{(1-{e}^{-{v}_{0}}){({\text{MX}}_{(n)})}^{2k-1}},$$

and hence we can restrict to measures *μ* with

$$\mu ({\mathit{\text{MX}}}_{(n)},\mathrm{\infty})\le \frac{4k}{(1-{e}^{-{v}_{0}}){X}_{(1)}^{1/2}{X}_{(n)}^{k-1/2}}\frac{1}{{M}^{k-1/2}}$$

for every *M* ≥ 1.

But this implies that *μ* satisfies

$${\int}_{0}^{\mathrm{\infty}}{t}^{k-3/4}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\mu (t)\le D$$

for some 0 < *D* = *D*_{ω} < ∞, and this implies that *t*^{k−1} is uniformly integrable over *μ* ∈ 𝒞.

Alternatively, for λ ≥ 1 we have

$$\begin{array}{cc}{\displaystyle {\int}_{t>\lambda}{t}^{k-1}}\mathrm{d}\mu (t)\hfill & ={\lambda}^{k-1}\mu (\lambda ,\mathrm{\infty})+(k-1){\displaystyle {\int}_{\lambda}^{\mathrm{\infty}}{s}^{k-2}}\mu (s,\mathrm{\infty})\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}s\hfill \\ \hfill & \le {\lambda}^{k-1}\frac{K}{{\lambda}^{k-1/2}}+(k-1){\displaystyle {\int}_{\lambda}^{\mathrm{\infty}}{s}^{k-2}}K{s}^{-(k-1/2)}\mathrm{d}s\hfill \\ \hfill & =K{\lambda}^{-1/2}+(k-1)K{\displaystyle {\int}_{\lambda}^{\mathrm{\infty}}{s}^{-3/2}}\mathrm{d}s\hfill \\ \hfill & \le K{\lambda}^{-1/2}+(k-1)2K{\lambda}^{-1/2}\hfill \\ \hfill & \to 0\phantom{\rule{thinmathspace}{0ex}}\text{as}\phantom{\rule{thinmathspace}{0ex}}\lambda \to \mathrm{\infty}\hfill \end{array}$$

uniformly in *μ* .

This implies that, for any sequence {*μ _{m}*} converging weakly to a measure *μ*_{0},

$$\underset{m\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{sup}}{\displaystyle {\int}_{\lambda}^{\mathrm{\infty}}{s}_{n,k}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mu}_{m}(t)}\le {\displaystyle {\int}_{\lambda}^{\mathrm{\infty}}{s}_{n,k}}(t)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mu}_{0}(t),$$

and hence Φ* _{n}* is lower semicontinuous:

$$\underset{m\to \mathrm{\infty}}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{\mathrm{\Phi}}_{n}({\mu}_{m})\ge {\mathrm{\Phi}}_{n}({\mu}_{0}).$$

As *Q _{n}* is lower semi-compact (i.e., the sets {*g* : *Q _{n}*(*g*) ≤ *c*}, *c* ∈ ℝ, are compact) and lower semicontinuous, a minimizer exists.

Lemma A.2 characterizes the LSE.

*For k* ≥ 1 *define* $\mathbb{Y}_n$ *and* $\tilde{H}_n$ *respectively by*

$${\mathbb{Y}}_{n}(t)={\displaystyle {\int}_{0}^{t}\phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\int}_{0}^{{t}_{k-1}}\cdots}}{\displaystyle {\int}_{0}^{{t}_{2}}{\mathbb{G}}_{n}({t}_{1})}\mathrm{d}{t}_{1}\mathrm{d}{t}_{2}\cdots \mathrm{d}{t}_{k-1}={\displaystyle {\int}_{0}^{t}\frac{{(t-x)}^{k-1}}{(k-1)!}}\mathrm{d}{\mathbb{G}}_{n}(x)$$

*and*

$${\tilde{H}}_{n}(t)={\displaystyle {\int}_{0}^{t}\phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\int}_{0}^{{t}_{k}}\cdots}}{\displaystyle {\int}_{0}^{{t}_{2}}{\tilde{g}}_{n}}({t}_{1})\mathrm{d}{t}_{1}\mathrm{d}{t}_{2}\cdots \mathrm{d}{t}_{k}={\displaystyle {\int}_{0}^{t}\frac{{(t-x)}^{k-1}}{(k-1)!}}{\tilde{g}}_{n}(x)\mathrm{d}x$$

*for t* ≥ 0. *Then* $\tilde{g}_n$ *is the LSE over 𝓜_{k} ∩ L_{2}(λ) if and only if the following conditions are satisfied:*

$$\{\begin{array}{cc}{\tilde{H}}_{n}(t)\ge {\mathbb{Y}}_{n}(t),\hfill & \mathit{\text{for}}\phantom{\rule{thinmathspace}{0ex}}t\ge 0,\phantom{\rule{thinmathspace}{0ex}}\mathit{\text{and}}\hfill \\ {\displaystyle {\int}_{0}^{\mathrm{\infty}}({\tilde{H}}_{n}-{\mathbb{Y}}_{n})\phantom{\rule{thinmathspace}{0ex}}d{\tilde{g}}_{n}^{(k-1)}}=0.\hfill & \hfill \end{array}$$

(A.4)

The arguments are very similar to those used in the proof of lemma 3.

Now, to prove that the LSE is a spline of degree *k* − 1 with simple knots, we need the following intermediate result.

*Let* [*a, b*] ⊂ (0, ∞) *and let g be a non-negative and non-increasing function on* [*a, b*]. *For any polynomial P*_{k − 1} *of degree* ≤ *k* − 1 *on* [*a, b*], *if the function*

$$\Delta(t)={\int}_{0}^{t}{(t-s)}^{k-1}g(s)\,\mathrm{d}s-{P}_{k-1}(t),\qquad t\in [a,b]$$

*admits infinitely many zeros in* [*a, b*], *then there exists* *t*_{0} ∈ [*a, b*] *such that g* ≡ 0 *on* [*t*_{0}, *b*] *and g* > 0 *on* [*a, t*_{0}) *if* *t*_{0} > *a*.

By applying the mean value theorem *k* times, it follows that (*k* − 1)!*g* = Δ^{(k)} admits infinitely many zeros in [*a*, *b*]. But as *g* is assumed to be non-negative and non-increasing, this implies that if *t*_{0} is the smallest zero of *g* in [*a*, *b*], then *g* ≡ 0 on [*t*_{0}, *b*]. By definition of *t*_{0}, *g* > 0 on [*a*, *t*_{0}) if *t*_{0} > *a*.

Now we will use the characterization of the LSE $\tilde{g}_n$, together with the previous proposition, to show that it is a finite mixture of rescaled beta(1, *k*) densities.

Now, note that $\mathbb{Y}_n$ can be given by the explicit expression:

$${\mathbb{Y}}_{n}(t)=\frac{1}{(k-1)!}\frac{1}{n}{\displaystyle \sum _{j=1}^{n}{(t-{X}_{(j)})}_{+}^{k-1}},\phantom{\rule{thinmathspace}{0ex}}\text{for}\phantom{\rule{thinmathspace}{0ex}}t>0.$$

In other words, $\mathbb{Y}_n$ is a spline of degree *k* − 1 with simple knots at the order statistics *X*_{(1)}, …, *X*_{(n)}.
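Concretely, $\mathbb{Y}_n$ can be evaluated directly from a sample; the sketch below (function and variable names are ours, not the paper's) implements the displayed formula:

```python
from math import factorial

def Y_n(t, data, k):
    """Evaluate the (k-1)-fold integral of the empirical distribution
    function: (1/n) * sum_j (t - X_j)_+^(k-1) / (k-1)!."""
    n = len(data)
    return sum((t - x) ** (k - 1) for x in data if t > x) / (factorial(k - 1) * n)
```

For *k* = 2 and the sample {1, 2}, `Y_n(3, [1, 2], 2)` averages (3 − 1) + (3 − 2) over *n* = 2 and returns 1.5.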

There exists *m* ∈ ℕ \ {0}, support points *ã*_{1}, …, *ã _{m}* > 0 and weights *w̃*_{1}, …, *w̃ _{m}* ≥ 0 such that

$${\tilde{g}}_{n}(x)={\tilde{w}}_{1}\frac{k{({\tilde{a}}_{1}-x)}_{+}^{k-1}}{{\tilde{a}}_{1}^{k}}+\cdots +{\tilde{w}}_{m}\frac{k{({\tilde{a}}_{m}-x)}_{+}^{k-1}}{{\tilde{a}}_{m}^{k}}.$$

(A.5)
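Each term in Equation A.5 is a beta(1, *k*) density rescaled to (0, *ã _{i}*), so the right-hand side integrates to the total weight $\sum_i \tilde{w}_i$. A minimal sketch of evaluating such a mixture (the weights and support points below are illustrative, not from the paper):

```python
def beta1k(x, a, k):
    """Density k * (a - x)_+^(k-1) / a^k of a rescaled beta(1, k) on (0, a)."""
    if x < 0 or x >= a:
        return 0.0
    return k * (a - x) ** (k - 1) / a ** k

def spline_mixture(x, weights, supports, k):
    """Evaluate the spline form of Equation A.5 at x."""
    return sum(w * beta1k(x, a, k) for w, a in zip(weights, supports))
```

With weights summing to one, a Riemann sum over (0, max *ã _{i}*) recovers total mass one, confirming that the mixture is a bona fide density.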

Consequently, the equality part in Equation A.4 can be re-expressed as ${\tilde{H}}_{n}({\tilde{a}}_{i})={\mathbb{Y}}_{n}({\tilde{a}}_{i})$ for *i* = 1, …, *m*.

We need to consider two cases:

- The number of zeros of ${\tilde{H}}_{n}-{\mathbb{Y}}_{n}$ is finite. This implies by the equality condition in Equation A.4 that the number of points of increase of ${(-1)}^{k-1}{\tilde{g}}_{n}^{(k-1)}$ is also finite. Therefore, ${(-1)}^{k-1}{\tilde{g}}_{n}^{(k-1)}$ is discrete with finitely many jumps and hence $\tilde{g}_n$ is of the form given in Equation A.5.
has infinitely many zeros. Let_{n}*j*be the smallest integer in {0, ,*n*− 1} such that [*X*_{(j)},*X*_{(j + 1)}] contains infinitely many zeros of(with_{n,k}*X*_{(0)}= 0 and*X*_{(n + 1)}= ∞). By proposition A.2, if*t*is the smallest zero of_{j}in [_{n}*X*_{(j)},*X*_{(j + 1)}], then0 on [_{n}*t*,_{j}*X*_{(j + 1)}] and> 0 on [_{n}*X*_{(j)},*t*) if_{j}*t*>_{j}*X*_{(j)}. Note that from the proof of proposition A.1, we know that the minimizing measuredoes not put any mass on (0,_{n}*X*_{(1)}], and hence the integer*j*has to be strictly greater than 0.

Now, by definition of *j*, ${\tilde{H}}_{n}-{\mathbb{Y}}_{n}$ has finitely many zeros to the left of *X*_{(j)}, and the argument of the first case shows that $\tilde{g}_n$ has the form given in Equation A.5.

*ĝ* = *g*_{0} (proof of Proposition 2). For 0 < α < 1 define${\eta}_{\alpha}={G}_{0}^{-1}(1-\alpha )$. Let ε > 0 be small so that ε < η_{ε}.

By Equation 6, there exists a number *D*_{ε} > 0 such that *ĝ _{l}*(η_{ε}) ≥ *D*_{ε} for all sufficiently large *l*. Indeed,

$$1\ge {\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\,\mathrm{d}{\mathbb{G}}_{l}(x)\ge {\int}_{{\eta}_{\epsilon}}^{\mathrm{\infty}}\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\,\mathrm{d}{\mathbb{G}}_{l}(x)\ge \frac{1}{{\widehat{g}}_{l}({\eta}_{\epsilon})}{\int}_{{\eta}_{\epsilon}}^{\mathrm{\infty}}{g}_{0}(x)\,\mathrm{d}{\mathbb{G}}_{l}(x),$$

and hence

$$\underset{l}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{\widehat{g}}_{l}({\eta}_{\epsilon})\ge \underset{l}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{inf}}{\displaystyle {\int}_{{\eta}_{\epsilon}}^{\mathrm{\infty}}{g}_{0}}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{\mathbb{G}}_{l}(x)={\displaystyle {\int}_{{\eta}_{\epsilon}}^{\mathrm{\infty}}{g}_{0}}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{G}_{0}(x)>0,$$

by the choice of η_{ε}, and the claim follows by taking ${D}_{\epsilon}={\displaystyle {\int}_{{\eta}_{\epsilon}}^{\mathrm{\infty}}{g}_{0}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}{G}_{0}(x)/2}$. Hence, by the bound in lemma 4, we have

$${\widehat{g}}_{l}(z)\le \frac{1}{z}{\left(1-\frac{1}{k}\right)}^{k-1}\equiv \frac{{e}_{k}}{z},\phantom{\rule{thinmathspace}{0ex}}{g}_{0}(z)\le \frac{1}{z}{\left(1-\frac{1}{k}\right)}^{k-1}\equiv \frac{{e}_{k}}{z}.$$
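The constant *e _{k}* = (1 − 1/*k*)^{k−1} in this bound is sharp for a single rescaled beta(1, *k*) component *g _{a}*(*x*) = *k*(*a* − *x*)_{+}^{k−1}/*a*^{k}: the product *x g _{a}*(*x*) is maximized at *x* = *a*/*k*, where it equals exactly *e _{k}*. A quick numerical check (the particular values of *a* and *k* are arbitrary choices of ours):

```python
def beta1k(x, a, k):
    """Density k * (a - x)_+^(k-1) / a^k on (0, a)."""
    return k * max(a - x, 0.0) ** (k - 1) / a ** k

k, a = 4, 3.0
e_k = (1 - 1 / k) ** (k - 1)          # = (3/4)^3 = 0.421875
# grid search for the maximum of x * g_a(x) over (0, a)
peak = max(x * beta1k(x, a, k) for x in (i * a / 100000 for i in range(100001)))
```

`peak` agrees with *e _{k}* to grid accuracy, consistent with the bound *ĝ _{l}*(*z*) ≤ *e _{k}*/*z*.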

It follows that *g*_{0}/*ĝ _{l}* is uniformly bounded on the interval [ε, η_{ε}]: there exist constants $0<\underline{c}_{\epsilon}\le \overline{c}_{\epsilon}<\infty$ such that, for *x* ∈ [ε, η_{ε}] and *l* large,

$$\underline{c}_{\epsilon}\le \frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\le \overline{c}_{\epsilon}.$$

In fact,

$$\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\le \frac{{g}_{0}(\epsilon )}{{\widehat{g}}_{l}({\eta}_{\epsilon})}\le \frac{{\epsilon}^{-1}{e}_{k}}{{D}_{\epsilon}},$$

while

$$\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\ge \frac{{g}_{0}({\eta}_{\epsilon})}{{\widehat{g}}_{l}(\epsilon )}\ge \frac{{g}_{0}({\eta}_{\epsilon})}{{\epsilon}^{-1}{e}_{k}}.$$

Therefore,

$$\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}\to \frac{{g}_{0}(x)}{\widehat{g}(x)}$$

uniformly on [ε, η_{ε}]. Using Equation 6, we have for sufficiently large *l*

$${\int}_{\epsilon}^{{\eta}_{\epsilon}}\frac{{g}_{0}(x)}{\widehat{g}(x)}\,\mathrm{d}{\mathbb{G}}_{l}(x)\le {\int}_{\epsilon}^{{\eta}_{\epsilon}}\left(\frac{{g}_{0}(x)}{{\widehat{g}}_{l}(x)}+\epsilon \right)\,\mathrm{d}{\mathbb{G}}_{l}(x)\le 1+\epsilon .$$

But as $\mathbb{G}_l$ converges weakly to *G*_{0}, letting *l* → ∞ yields

$${\int}_{\epsilon}^{{\eta}_{\epsilon}}\frac{{g}_{0}(x)}{\widehat{g}(x)}\,\mathrm{d}{G}_{0}(x)\le 1+\epsilon .$$

Now, by Lebesgue’s monotone convergence theorem, we conclude that

$${\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}(x)}{\widehat{g}(x)}\,\mathrm{d}{G}_{0}(x)\le 1,$$

which is equivalent to

$${\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}^{2}(x)}{\widehat{g}(x)}\,\mathrm{d}x\le 1.$$

(A.6)

Define $\tau ={\displaystyle {\int}_{0}^{\mathrm{\infty}}\widehat{g}(x)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x}$. Then, *ĥ* = τ^{−1}*ĝ* is a *k*-monotone density. By Equation A.6, we have that

$${\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}^{2}(x)}{\widehat{h}(x)}\,\mathrm{d}x=\tau {\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}^{2}(x)}{\widehat{g}(x)}\,\mathrm{d}x\le \tau .$$

Now, consider the function

$$\mathit{K}(g)={\displaystyle {\int}_{0}^{\mathrm{\infty}}\frac{{g}_{0}^{2}(x)}{g(x)}}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x$$

defined on the class 𝒟 of all continuous densities *g* on (0, ∞). Since every *g* ∈ 𝒟 integrates to one, minimizing *K* over 𝒟 is equivalent to minimizing

$${\int}_{0}^{\mathrm{\infty}}\phantom{\rule{thinmathspace}{0ex}}\left(\frac{{g}_{0}^{2}(x)}{g(x)}+g(x)\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x.$$

It is easy to see that the integrand is minimized pointwise by taking *g*(*x*) = *g*_{0}(*x*), since ${g}_{0}^{2}/g+g\ge 2{g}_{0}$ with equality if and only if *g* = *g*_{0}. Hence, inf_{𝒟} *K*(*g*) = *K*(*g*_{0}) = 1.

Now, if *g* ≠ *g*_{0} at a point *x*, it follows by continuity that *g* ≠ *g*_{0} on an interval of positive length. Hence, *g* ≠ *g*_{0} implies *K*(*g*) > 1. Since τ ≤ 1 by Fatou's lemma, the chain 1 ≤ *K*(*ĥ*) ≤ τ ≤ 1 forces τ = 1 and *K*(*ĥ*) = 1; that is, *ĝ* = *ĥ* = *g*_{0}.

Fix *c* > 0 and suppose that the true *k*-monotone density *g*_{0} satisfies ${\int}_{0}^{\mathrm{\infty}}{x}^{-1/2}{g}_{0}(x)\,\mathrm{d}x<\mathrm{\infty}$. Then, ‖$\tilde{g}_n$ − *g*_{0}‖_{2} →_{a.s.} 0 and

$$\underset{x\in [c,\mathrm{\infty})}{\text{sup}}|{\tilde{g}}_{n}^{(j)}(x)-{g}_{0}^{(j)}(x)|{\to}_{a.s.}0,\text{\hspace{1em}as}\phantom{\rule{thinmathspace}{0ex}}n\to \mathrm{\infty},$$

for *j* = 0, 1, , *k* − 2, and, for each *x* > 0 at which *g*_{0} is (*k* − 1)-times differentiable, ${\tilde{g}}_{n}^{(k-1)}(x){\to}_{a.s.}{g}_{0}^{(k-1)}(x)$. Here, ‖ · ‖_{2} denotes the
*L*_{2}-norm.

The main difficulty here is that the LSE $\tilde{g}_n$ is not necessarily a density in that it may integrate to more than one; indeed it can be shown that ${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{1}(x)\,\mathrm{d}x=((2k-2)/k){(1-1/(2k-1))}^{k-2}>1$ for *n* = 1 and *k* ≥ 3.

Here we will first show that ${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}^{2}\,\mathrm{d}\lambda ={O}_{p}(1)$. Note that the equality part in Equation A.4 can be re-written as ${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}^{2}(x)\,\mathrm{d}x={\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}(x)\,\mathrm{d}{\mathbb{G}}_{n}(x)$ and hence

$$\sqrt{{\displaystyle {\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}^{2}(x)\mathrm{d}x}}={\displaystyle {\int}_{0}^{\mathrm{\infty}}{\tilde{u}}_{n}(x)}\mathrm{d}{\mathbb{G}}_{n}(x),$$

(A.7)

where ${\tilde{u}}_{n}={\tilde{g}}_{n}/{\Vert {\tilde{g}}_{n}\Vert}_{2}$ is an element of the class

$${\mathcal{F}}_{k}=\left\{g\in {\U0001d4dc}_{k},{\displaystyle {\int}_{0}^{\mathrm{\infty}}{g}^{2}\mathrm{d}\lambda =1}\right\}.$$

In the following, we show that ${\mathcal{F}}_{k}$ admits an envelope *G* ∈ *L*_{1}(*G*_{0}). For any *g* ∈ ${\mathcal{F}}_{k}$ and *x* > 0, we have

$$1={\displaystyle {\int}_{0}^{\mathrm{\infty}}{g}^{2}\mathrm{d}\lambda \ge}{\displaystyle {\int}_{0}^{x}{g}^{2}\mathrm{d}\lambda \ge x{g}^{2}(x)},$$

as *g* is decreasing. Therefore, $g(x)\le 1/\sqrt{x}\equiv G(x)$ for all *x* > 0 and *g* ∈ ${\mathcal{F}}_{k}$; that is,

$${\int}_{0}^{\mathrm{\infty}}{\tilde{u}}_{n}(x)\,\mathrm{d}{\mathbb{G}}_{n}(x)\le {\int}_{0}^{\mathrm{\infty}}G(x)\,\mathrm{d}{\mathbb{G}}_{n}(x){\to}_{a.s.}{\int}_{0}^{\mathrm{\infty}}G(x)\,\mathrm{d}{G}_{0}(x)<\mathrm{\infty},\qquad \text{as } n\to \mathrm{\infty},$$

and hence by Equation A.7 the integral
${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}^{2}\mathrm{d}\lambda$ is bounded (almost surely) by some constant *M _{k}*.

Now we are ready to complete the proof. Let δ > 0 and τ* _{n}* be the last jump point of ${\tilde{g}}_{n}^{(k-1)}$ if there are jump points in the interval (0, δ]; otherwise, we take τ* _{n}* = 0. We consider two cases:

- τ*_{n}* ≥ δ/2. Let *n* be large enough so that ${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{n}^{2}\,\mathrm{d}\lambda \le {M}_{k}$. We have
$$\begin{aligned}\tilde{g}_n(\tau_n)&\le \tilde{g}_n(\delta/2)=(2/\delta)(\delta/2)\tilde{g}_n(\delta/2)\le (2/\delta)\int_0^{\delta/2}\tilde{g}_n(x)\,\mathrm{d}x\\ &\le (2/\delta)\sqrt{\delta/2}\sqrt{\int_0^{\delta/2}\tilde{g}_n^2(x)\,\mathrm{d}x}\le \sqrt{2/\delta}\sqrt{\int_0^{\infty}\tilde{g}_n^2(x)\,\mathrm{d}x}=\sqrt{2M_k/\delta}.\end{aligned}$$
(A.8)
- τ*_{n}* < δ/2. We have
$$\int_{\tau_n}^{\delta}\tilde{g}_n(x)\,\mathrm{d}x\le \sqrt{\delta-\tau_n}\sqrt{\int_{\tau_n}^{\delta}\tilde{g}_n^2(x)\,\mathrm{d}x}\le \sqrt{\delta}\sqrt{\int_0^{\infty}\tilde{g}_n^2(x)\,\mathrm{d}x}=\sqrt{\delta M_k}.$$
Using that $\tilde{g}_n$ is a polynomial of degree *k* − 1 on the interval [τ*_{n}*, δ], we have
$$\begin{aligned}\sqrt{\delta M_k}&\ge \int_{\tau_n}^{\delta}\tilde{g}_n(x)\,\mathrm{d}x=\tilde{g}_n(\delta)(\delta-\tau_n)-\frac{\tilde{g}_n^{\prime}(\delta)}{2}(\delta-\tau_n)^2+\cdots+(-1)^{k-1}\frac{\tilde{g}_n^{(k-1)}(\delta)}{k!}(\delta-\tau_n)^k\\ &\ge (\delta-\tau_n)\left(\tilde{g}_n(\delta)+\frac{1}{k}\left(-\tilde{g}_n^{\prime}(\delta)(\delta-\tau_n)+\cdots+(-1)^{k-1}\frac{\tilde{g}_n^{(k-1)}(\delta)}{(k-1)!}(\delta-\tau_n)^{k-1}\right)\right)\\ &=(\delta-\tau_n)\left(\tilde{g}_n(\delta)\left(1-\frac{1}{k}\right)+\frac{1}{k}\tilde{g}_n(\tau_n)\right)\ge \frac{\delta}{2k}\tilde{g}_n(\tau_n),\end{aligned}$$
and hence ${\tilde{g}}_{n}({\tau}_{n})\le 2k\sqrt{{M}_{k}/\delta}$. By combining the two cases, we have for large *n* that ${\tilde{g}}_{n}({\tau}_{n})\le 2k\sqrt{{M}_{k}/\delta}\equiv {C}_{k}$. Now, as $\tilde{g}_n(\delta)\le \tilde{g}_n(\tau_n)$, the sequence $\tilde{g}_n(x)$ is uniformly bounded almost surely for all *x* ≥ δ. Using a Cantor diagonalization argument, we can find a subsequence {*n _{l}*} so that, for each *x* ≥ δ, $\tilde{g}_{n_l}(x)\to \tilde{g}(x)$ as *l* → ∞.
By Fatou’s lemma, we have
$$\int_{\delta}^{\infty}(\tilde{g}(x)-g_0(x))^2\,\mathrm{d}x\le \underset{l\to \mathrm{\infty}}{\text{lim}\,\text{inf}}\int_{\delta}^{\infty}(\tilde{g}_{n_l}(x)-g_0(x))^2\,\mathrm{d}x.$$
(A.9)
However, the characterization of $\tilde{g}_n$ implies that *Q _{n}*($\tilde{g}_n$) ≤ *Q _{n}*(*g*_{0}), and this yields
$$\int_0^{\infty}(\tilde{g}_n(x)-g_0(x))^2\,\mathrm{d}x\le 2\int_0^{\infty}(\tilde{g}_n(x)-g_0(x))\,\mathrm{d}(\mathbb{G}_n(x)-G_0(x)).$$
Thus, we can write
$$\int_{\delta}^{\infty}(\tilde{g}_{n_l}(x)-g_0(x))^2\,\mathrm{d}x\le \int_0^{\infty}(\tilde{g}_{n_l}(x)-g_0(x))^2\,\mathrm{d}x\le 2\int_0^{\infty}(\tilde{g}_{n_l}(x)-g_0(x))\,\mathrm{d}(\mathbb{G}_{n_l}(x)-G_0(x)){\to}_{a.s.}0,$$
(A.10)
as *l* → ∞. The last convergence is justified as follows: as ${\int}_{0}^{\mathrm{\infty}}{\tilde{g}}_{{n}_{l}}^{2}\,\mathrm{d}\lambda$ is bounded almost surely, we can find a constant *C* > 0 such that ${\tilde{g}}_{{n}_{l}}-{g}_{0}$ admits $G(x)=C/\sqrt{x}$, *x* > 0, as an envelope. Since *G* ∈ *L*_{1}(*G*_{0}) by hypothesis, and as the class of functions {(*g* − *g*_{0})1_{[G ≤ M]} : *g* ∈ 𝓜_{k} ∩ *L*_{2}(λ)} is a Glivenko–Cantelli class for every *M* > 0 (each element is a difference of two bounded monotone functions), Equation A.10 holds. From Equation A.9, we conclude that ${\int}_{\delta}^{\mathrm{\infty}}{(\tilde{g}(x)-{g}_{0}(x))}^{2}\,\mathrm{d}x\le 0$, and therefore $\tilde{g}={g}_{0}$ on (0, ∞) as δ > 0 can be chosen arbitrarily small. We have proved that there exists Ω_{0} with *P*(Ω_{0}) = 1 such that for each ω ∈ Ω_{0} and any given subsequence $\tilde{g}_{n_k}(\cdot,\omega)$, we can extract a further subsequence $\tilde{g}_{n_l}(\cdot,\omega)$ that converges to *g*_{0} on (0, ∞).
It follows that $\tilde{g}_n$ converges to *g*_{0} on (0, ∞), and this convergence is uniform on intervals of the form [*c*, ∞), *c* > 0, by the monotonicity and continuity of *g*_{0}. As for the MLE, consistency of the higher derivatives can be shown recursively using the convexity of ${(-1)}^{j}{\tilde{g}}_{n}^{(j)}$ for *j* = 1, …, *k* − 2.

Let *μ* be a positive number and consider the function ${\tilde{g}}_{\mu}$ defined by

$${\tilde{g}}_{\mu}(x)={({x}_{0}+\mu -x)}^{k+1}{(x-{x}_{0}+\mu )}^{k+2}{1}_{[{x}_{0}-\mu ,{x}_{0}+\mu ]}(x).$$

Now, consider the perturbation

$${g}_{\mu}(x)={g}_{0}(x)+s(\mu ){\tilde{g}}_{\mu}(x),\phantom{\rule{thinmathspace}{0ex}}x\in (0,\mathrm{\infty}),$$

where *s*(*μ*) is a scale to be determined later. If *μ* is chosen small enough so that the true density *g*_{0} is *k*-times continuously differentiable on [*x*_{0} − *μ*, *x*_{0} + *μ*], the perturbed function *g _{μ}* is also *k*-times continuously differentiable there. Define

$$r(x)={(1-x)}^{k+1}{(1+x)}^{k+2}{1}_{[-1,1]}(x)={(1-{x}^{2})}^{k+1}(1+x){1}_{[-1,1]}(x).$$

Then, since ${\tilde{g}}_{\mu}(x)={\mu}^{2k+3}r((x-{x}_{0})/\mu )$, we can write

$${g}_{\mu}^{(j)}({x}_{0})-{g}_{0}^{(j)}({x}_{0})=s(\mu ){\mu}^{2k+3-j}{r}^{(j)}(0).$$

The scale *s*(*μ*) should be chosen so that ${(-1)}^{j}{g}_{\mu}^{(j)}(x)>0$ for all 0 ≤ *j* ≤ *k* and *x* ∈ [*x*_{0} − *μ*, *x*_{0} + *μ*]. But for *μ* small enough, the sign of ${(-1)}^{j}{g}_{\mu}^{(j)}$ will be that of ${(-1)}^{j}{g}_{0}^{(j)}({x}_{0})$, and hence *g _{μ}* is *k*-monotone in a neighborhood of *x*_{0}. Taking $s(\mu )={g}_{0}^{(k)}({x}_{0})/({\mu}^{k+3}{r}^{(k)}(0))$ yields

$${g}_{\mu}^{(j)}({x}_{0})={g}_{0}^{(j)}({x}_{0})+{\mu}^{k-j}\frac{{g}_{0}^{(k)}({x}_{0}){r}^{(j)}(0)}{{r}^{(k)}(0)}={g}_{0}^{(j)}({x}_{0})+o(\mu ),$$

as *μ* → 0, and

$${(-1)}^{k}{g}_{\mu}^{(k)}({x}_{0})=2{(-1)}^{k}{g}_{0}^{(k)}({x}_{0})>0.$$

To compute *r*^{(j)}(0), note that for *m* ≥ 2 and 2*n* ≥ *m* we have

$$\begin{array}{cc}{\left({(1-{x}^{2})}^{n}\right)}^{(m)}\hfill & ={(({(1-{x}^{2})}^{n})\prime )}^{(m-1)}\hfill \\ \hfill & ={(-2nx{(1-{x}^{2})}^{n-1})}^{(m-1)}\hfill \\ \hfill & =-2n\phantom{\rule{thinmathspace}{0ex}}(x{({(1-{x}^{2})}^{n-1})}^{(m-1)}+(m-1){({(1-{x}^{2})}^{n-1})}^{(m-2)}),\hfill \end{array}$$

where in the last equality we used Leibniz’s formula for the derivatives of a product; see, for example, Apostol (1957, p. 99). Evaluating the last expression at *x* = 0 yields

$${x}_{n,m}\equiv {\left({(1-{x}^{2})}^{n}\right)}^{(m)}{|}_{x=0}=-2n(m-1){x}_{n-1,m-2}.$$

If *m* is even, we find that

$$\begin{array}{cc}{x}_{n,m}\hfill & ={(-2)}^{m/2}{\displaystyle \prod _{i=0}^{m/2-1}(n-i)}\phantom{\rule{thinmathspace}{0ex}}\times {\displaystyle \prod _{i=0}^{m/2-1}(m-2i-1)}\phantom{\rule{thinmathspace}{0ex}}\times {x}_{n-m/2,0}\hfill \\ \hfill & ={(-2)}^{m/2}{\displaystyle \prod _{i=0}^{m/2-1}(n-i)}\phantom{\rule{thinmathspace}{0ex}}\times {\displaystyle \prod _{i=0}^{m/2-1}(m-2i-1)}\hfill \end{array}$$

as *x*_{n − m/2,0} = 1. Similarly, when *m* is odd,

$$\begin{array}{cc}{x}_{n,m}\hfill & ={(-2)}^{(m-1)/2}{\displaystyle \prod _{i=0}^{(m-1)/2-1}(n-i)}\phantom{\rule{thinmathspace}{0ex}}\times {\displaystyle \prod _{i=0}^{(m-1)/2-1}(m-2i-1)}\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{x}_{n-(m-1)/2,1}\hfill \\ \hfill & =0\hfill \end{array}$$

as *x*_{n − (m−1)/2,1} = 0. Now we have, for 1 ≤*j* ≤*k*,

$$\begin{array}{cc}{r}^{(j)}(x)\hfill & ={({(1-{x}^{2})}^{k+1}(1+x))}^{(j)}\hfill \\ \hfill & =(x+1)\phantom{\rule{thinmathspace}{0ex}}{\left({(1-{x}^{2})}^{k+1}\right)}^{(j)}+j\phantom{\rule{thinmathspace}{0ex}}{\left({(1-{x}^{2})}^{k+1}\right)}^{(j-1)}\hfill \end{array}$$

and hence

$${r}^{(j)}(0)={\left({(1-{x}^{2})}^{k+1}\right)}^{(j)}{|}_{x=0}+j\phantom{\rule{thinmathspace}{0ex}}{\left({(1-{x}^{2})}^{k+1}\right)}^{(j-1)}{|}_{x=0}.$$

Therefore, when *j* is even the second term vanishes and

$${r}^{(j)}(0)={(-2)}^{j/2}{\displaystyle \prod _{i=0}^{j/2-1}(k+1-i)}\phantom{\rule{thinmathspace}{0ex}}\times {\displaystyle \prod _{i=0}^{j/2-1}(j-2i-1)}\ne 0.$$

When *j* is odd, the first term vanishes and

$${r}^{(j)}(0)={(-2)}^{(j-1)/2}{\displaystyle \prod _{i=0}^{(j-1)/2-1}(k+1-i)}\phantom{\rule{thinmathspace}{0ex}}\times {\displaystyle \prod _{i=0}^{(j-1)/2}(j-2i)}\ne 0.$$

Summarizing, we have shown that

$${r}^{(j)}(0)=\{\begin{array}{cc}{(-2)}^{j/2}{\displaystyle {\prod}_{i=0}^{j/2-1}(k+1-i)\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\prod}_{i=0}^{j/2-1}(j-2i-1)\ne 0}}\hfill & j\phantom{\rule{thinmathspace}{0ex}}\text{even}\hfill \\ {(-2)}^{(j-1)/2}{\displaystyle {\prod}_{i=0}^{(j-1)/2-1}(k+1-i)\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\prod}_{i=0}^{(j-1)/2}(j-2i)\ne 0}}\hfill & j\phantom{\rule{thinmathspace}{0ex}}\text{odd}.\hfill \end{array}$$
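Because *r* is a polynomial, *r*^{(j)}(0) equals *j*! times its *j*-th coefficient, which gives an exact, dependency-free check of the displayed formula (the helper names below are ours):

```python
from math import factorial, prod

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def r_deriv_at_0(k, j):
    """Exact r^(j)(0) for r(x) = (1 - x^2)^(k+1) * (1 + x)."""
    p = [1]
    for _ in range(k + 1):
        p = poly_mul(p, [1, 0, -1])   # multiply by (1 - x^2)
    p = poly_mul(p, [1, 1])           # multiply by (1 + x)
    return factorial(j) * p[j]

def r_deriv_closed(k, j):
    """The closed form for r^(j)(0) displayed in the text."""
    if j % 2 == 0:
        m = j // 2
        return (-2) ** m * prod(k + 1 - i for i in range(m)) \
                         * prod(j - 2 * i - 1 for i in range(m))
    m = (j - 1) // 2
    return (-2) ** m * prod(k + 1 - i for i in range(m)) \
                     * prod(j - 2 * i for i in range(m + 1))
```

For example, with *k* = 3 and *j* = 3, both routes give *r*^{(3)}(0) = −24.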

We set *C _{k,j}* = *r*^{(j)}(0) for *j* = 0, …, *k*. In particular,

$${C}_{k,k}=\{\begin{array}{cc}{(-2)}^{k/2}{\displaystyle {\prod}_{i=0}^{k/2-1}(k+1-i)\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\prod}_{i=0}^{k/2-1}(k-2i-1)}}\hfill & \text{if}\phantom{\rule{thinmathspace}{0ex}}k\phantom{\rule{thinmathspace}{0ex}}\text{is even}\hfill \\ {(-2)}^{(k-1)/2}{\displaystyle {\prod}_{i=0}^{(k-1)/2-1}(k+1-i)\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{\displaystyle {\prod}_{i=0}^{(k-1)/2}(k-2i)}}\hfill & \text{if}\phantom{\rule{thinmathspace}{0ex}}k\phantom{\rule{thinmathspace}{0ex}}\text{is odd}.\hfill \end{array}$$

The previous expressions can be given in a more compact form. After some algebra, we find that

$${C}_{k,k}=\{\begin{array}{cc}2\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{(-1)}^{k/2}(k+1)(k-1)!\left(\begin{array}{c}k\\ k/2-1\end{array}\right)\hfill & \text{if}\phantom{\rule{thinmathspace}{0ex}}k\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}\text{even}\hfill \\ {(-1)}^{(k-1)/2}k!\left(\begin{array}{c}k+1\\ (k-1)/2\end{array}\right)\hfill & \text{if}\phantom{\rule{thinmathspace}{0ex}}k\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}\text{odd}.\hfill \end{array}$$

(A.11)
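The simplification from the product form of *C _{k,k}* to the binomial form in Equation A.11 is easy to confirm mechanically (helper names are ours):

```python
from math import comb, factorial, prod

def C_kk_product(k):
    """C_{k,k} = r^(k)(0) in product form."""
    if k % 2 == 0:
        m = k // 2
        return (-2) ** m * prod(k + 1 - i for i in range(m)) \
                         * prod(k - 2 * i - 1 for i in range(m))
    m = (k - 1) // 2
    return (-2) ** m * prod(k + 1 - i for i in range(m)) \
                     * prod(k - 2 * i for i in range(m + 1))

def C_kk_compact(k):
    """Compact form of Equation A.11."""
    if k % 2 == 0:
        return 2 * (-1) ** (k // 2) * (k + 1) * factorial(k - 1) * comb(k, k // 2 - 1)
    return (-1) ** ((k - 1) // 2) * factorial(k) * comb(k + 1, (k - 1) // 2)
```

For instance, *C*_{2,2} = −6, *C*_{3,3} = −24 and *C*_{4,4} = 240 under both forms.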

We have for 0 ≤*j* ≤*k* − 1,

$$|{T}_{j}({g}_{\mu})-{T}_{j}({g}_{0})|=\left|\frac{{C}_{k,j}}{{C}_{k,k}}{g}_{0}^{(k)}({x}_{0})\right|{\mu}^{k-j}\equiv {\lambda}_{k,1}^{(j)}|{g}_{0}^{(k)}({x}_{0})|{\mu}^{k-j},$$

where we defined ${\lambda}_{k,1}^{(j)}=|{C}_{k,j}/{C}_{k,k}|$ for *j* {0, …, *k* − 1}. Furthermore, by computation and change of variables,

$${\int}_{0}^{\mathrm{\infty}}\frac{{({g}_{\mu}(x)-{g}_{0}(x))}^{2}}{{g}_{0}(x)}}\mathrm{d}x=\left(\frac{{({g}_{0}^{(k)}({x}_{0}))}^{2}}{{g}_{0}({x}_{0})}\frac{{\displaystyle {\int}_{-1}^{1}{(1-{z}^{2})}^{2(k+1)}{(z+1)}^{2}\mathrm{d}z}}{{({C}_{k,k})}^{2}}\right)\phantom{\rule{thinmathspace}{0ex}}{\mu}^{2k+1}+o({\mu}^{2k+2})$$

as *μ* → 0. This gives control of the Hellinger distance as well, in view of Jongbloed (2000, lemma 2, p. 282) or Jongbloed (1995, corollary 3.2, pp. 30–31). We set

$$\lambda_{k,2}=\frac{\int_{-1}^{1}(1-z^2)^{2(k+1)}(z+1)^2\,\mathrm{d}z}{(C_{k,k})^2}=\begin{cases}2^{4k+6}\,\dfrac{(2k+3)(k+2)}{(k+1)^2}\,\dfrac{((2(k+1))!)^2}{(4k+7)!\,((k-1)!)^2\dbinom{k}{k/2-1}^2}& k\ \text{even}\\[2ex]2^{4(k+2)}\,(2k+3)(k+2)\,\dfrac{((2(k+1))!)^2}{(4k+7)!\,(k!)^2\dbinom{k+1}{(k-1)/2}^2}& k\ \text{odd.}\end{cases}$$
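Since the integrand defining λ_{k,2} is a polynomial, the constant can also be evaluated exactly in rational arithmetic, a useful guard against the factorial marks that are easily lost in typesetting (helper names are ours; `C_kk` uses the compact form of Equation A.11):

```python
from fractions import Fraction
from math import comb, factorial

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def C_kk(k):
    """Compact form of C_{k,k} from Equation A.11."""
    if k % 2 == 0:
        return 2 * (-1) ** (k // 2) * (k + 1) * factorial(k - 1) * comb(k, k // 2 - 1)
    return (-1) ** ((k - 1) // 2) * factorial(k) * comb(k + 1, (k - 1) // 2)

def lam_k2(k):
    """lambda_{k,2} = int_{-1}^{1} (1 - z^2)^(2(k+1)) (1 + z)^2 dz / C_{k,k}^2."""
    p = [1]
    for _ in range(2 * (k + 1)):
        p = poly_mul(p, [1, 0, -1])   # (1 - z^2)^(2(k+1))
    p = poly_mul(p, [1, 2, 1])        # times (1 + z)^2
    # int_{-1}^{1} z^i dz = 2/(i+1) for even i, 0 for odd i
    integral = sum(Fraction(2 * c, i + 1) for i, c in enumerate(p) if i % 2 == 0)
    return integral / C_kk(k) ** 2
```

For instance, `lam_k2(2)` ≈ 0.020207 and `lam_k2(3)` ≈ 0.0010948.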

Now, by using the change of variable ε = *μ*^{2k + 1}(*b _{k}* + *o*(1)), where

$${b}_{k}={\lambda}_{k,2}{({g}_{0}^{(k)}({x}_{0}))}^{2}/{g}_{0}({x}_{0}),$$

so that *μ* = (ε/*b _{k}*)^{1/(2k+1)}(1 + *o*(1)), we obtain

$${m}_{j}(\epsilon )\ge {\lambda}_{k,1}^{(j)}|{g}_{0}^{(k)}({x}_{0})|{\left(\frac{\epsilon}{{b}_{k}}\right)}^{(k-j)/(2k+1)}(1+o(1)).$$

The result is that *m _{j}*(ε) ≥ (*r _{k,j}* ε)^{(k−j)/(2k+1)}(1 + *o*(1)), where ${r}_{k,j}={({\lambda}_{k,1}^{(j)}|{g}_{0}^{(k)}({x}_{0})|)}^{(2k+1)/(k-j)}/{b}_{k}$. An application of the minimax lower bound of Jongbloed (2000) now yields

$$\begin{array}{c}\underset{\tau >0}{\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \mathrm{\infty}}{\text{lim}}\phantom{\rule{thinmathspace}{0ex}}\text{inf}\phantom{\rule{thinmathspace}{0ex}}{n}^{(k-j)/(2k+1)}{\text{MMR}}_{1}(n,{T}_{j},{\mathcal{D}}_{k,n,\tau})\hfill \\ \ge \frac{1}{4}{\left(4\frac{k-j}{2k+1}{e}^{-1}\right)}^{(k-j)/(2k+1)}{({r}_{k,j})}^{(k-j)/(2k+1)},\hfill \end{array}$$

(A.12)

which can be rewritten as

$$\begin{array}{c}\underset{\tau >0}{\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \mathrm{\infty}}{\text{lim}}\phantom{\rule{thinmathspace}{0ex}}\text{inf}\phantom{\rule{thinmathspace}{0ex}}{n}^{(k-j)/(2k+1)}{\text{MMR}}_{1}(n,{T}_{j},{\mathcal{D}}_{k,n,\tau})\hfill \\ \ge \frac{1}{4}{\left(4\frac{k-j}{2k+1}{e}^{-1}\right)}^{(k-j)/(2k+1)}\frac{{\lambda}_{k,1}^{(j)}}{{({\lambda}_{k,2})}^{(k-j)/(2k+1)}}\left\{|{g}_{0}^{(k)}({x}_{0}){|}^{(2j+1)/(2k+1)}{g}_{0}{({x}_{0})}^{(k-j)/(2k+1)}\right\}\hfill \end{array}$$

for *j* = 0, …, *k* − 1. Finally, note that the fact that the function *g _{μ}* is not exactly a density will not affect the obtained constants, as its integral converges to 1 as *μ* → 0.

Let ${G}_{\mu}(x)={\displaystyle {\int}_{0}^{x}{g}_{\mu}(t)\mathrm{d}t}$. Using the inversion formula in Equation 2, we have

$$T({g}_{\mu})-T({g}_{0})={G}_{\mu}({x}_{0})-{G}_{0}({x}_{0})+{\displaystyle \sum _{j=1}^{k}{(-1)}^{j}}\frac{{x}_{0}^{j}}{j!}({T}_{j-1}({g}_{\mu})-{T}_{j-1}({g}_{0})).$$

For *j* = 1, …, *k*, we have already established that $|{T}_{j-1}({g}_{\mu})-{T}_{j-1}({g}_{0})|={\lambda}_{k,1}^{(j-1)}|{g}_{0}^{(k)}({x}_{0})|{\mu}^{k-j+1}$. In contrast, we have for *μ* > 0 small enough

$$\begin{aligned}G_{\mu}(x_0)&=G_0(x_0)+s(\mu)\int_{x_0-\mu}^{x_0}(x_0+\mu-x)^{k+1}(x-x_0+\mu)^{k+2}\,\mathrm{d}x\\ &=G_0(x_0)+s(\mu)\,\mu^{2k+4}\int_{-1}^{0}r(z)\,\mathrm{d}z\\ &=G_0(x_0)+\frac{g_0^{(k)}(x_0)}{\mu^{k+3}r^{(k)}(0)}\left(\int_{-1}^{0}r(z)\,\mathrm{d}z\right)\mu^{2k+4}\\ &=G_0(x_0)+O(\mu^{k+1}).\end{aligned}$$

Hence,

$$|T({g}_{\mu})-T({g}_{0})|=\frac{{x}_{0}^{k}}{k!}{\lambda}_{k,1}^{(k-1)}|{g}_{0}^{(k)}({x}_{0})|\mu +o(\mu ).$$

Using again the change of variable ε = *μ*^{2k + 1}(*b _{k}* + *o*(1)), the minimax risk lower bound for *T* follows in the same way.

Fadoua Balabdaoui, CEREMADE, Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775, Paris, CEDEX 16, France.

Jon A. Wellner, Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195-4322, USA.

- Anevski D. Estimating the derivative of a convex density. University of Lund: Department of Mathematical Statistics; 1994. Technical Report 1994:8.
- Anevski D. Estimating the derivative of a convex density. Statistica Neerlandica. 2003;57:245–257.
- Apostol TM. Mathematical analysis: a modern approach to advanced calculus. Reading, MA: Addision-Wesley Publishing Company, Inc.; 1957.
- Ayer M, Brunk HD, Ewing GM, Reid WT, Silverman E. An empirical distribution function for sampling with incomplete information. The Annals of Mathematical Statistics. 1955;26:641–647.
- Balabdaoui F. Consistent estimation of a convex density at the origin. Mathematical Methods of Statistics. 2007;16:77–95.
- Balabdaoui F, Wellner JA. Estimation of a K-monotone density, part 2: algorithm for computation and numerical results. University of Washington: Department of Statistics; 2004. Technical Report 460.
- Balabdaoui F, Wellner JA. Estimation of a K-monotone density: Limit distribution theory and the spline connection. The Annals of Statistics. 2007;35:2536–2564.
- Brunk HD. Probability theory and elements of measure theory. London: Academic Press Inc. [Harcourt Brace Jovanovich Publisher]; 1981. Second edition of the translation by R. B. Burckel from the third German edition, Probability and Mathematical Statistics.
- Brunk HD. On the estimation of parameters restricted by inequalities. Annals of Mathematical Statistics. 1958;29:437–454.
- De Boor C. A practical guide to splines, vol. 27 of Applied Mathematical Science. New York: Springer-Verlag; 1978.
- DeVore RA, Lorentz GG. Constructive approximation, vol. 303 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] Berlin: Springer-Verlag; 1993.
- Donoho DL, Liu RC. Geometrizing rates of convergence. i. University of Califonia, Berkeley: Department of Statistics; 1987. Technical Report. 137.
- Donoho DL, Liu RC. Geometrizing rates of convergence. II III. The Annals of Statistics. 1991;19:633–667. 668–701.
- van Eeden C. Maximum likelihood estimation of partially or completely ordered parameters. I, Koninklijke Nederlandse Akademie van Wetenschappen Proceedings. Series A Mathematical Sciences. 1957a;19:128–136.
- van Eeden C. Maximum likelihood estimation of partially or completely ordered parameters. II, Koninklijke Nederlandse Akademie van Wetenschappen Proceedings. Series A Mathematical Sciences. 1957b;19:201–211.
- Feller W. An introduction to probability theory and its applications. 2nd edition. vol. II. New York: John Wiley & Sons Inc.; 1971.
- van de Geer S. Hellinger-consistency of certain nonparametric maximum likelihood estimators. The Annals of Statistics. 1993;21:14–44.
- Gneiting T. On the Bernstein–Hausdorff–Widder conditions for completely monotone functions. Expositiones Mathematicae. 1998;16:88–119.
- Gneiting T. Radial positive definite functions generated by Euclid’s hat. Journal of Multivariate Analysis. 1999;69:88–119.
- Grenander U. On the theory of mortality measurement. I Skandinavisk Aktuarietidskrift. 1956a;39:70–96.
- Grenander U. On the theory of mortality measurement. II, Skandinavisk Aktuarietidskrift. 1956b;39:125–153.
- Groeneboom P. Estimating a monotone density. In: Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, CA, 1983). Belmont, CA: Wadsworth; 1985.
- Groeneboom P. Brownian motion with a parabolic drift and Airy functions. Probability Theory and Related Fields. 1989;81:79–109.
- Groeneboom P. Lectures on probability theory and statistics (Saint-Flour 1994) Berlin: Springer; 1996. Lectures on inverse problems; pp. 67–164. vol. 1648 of Lecture Notes in Mathematics.
- Groeneboom P, Jongbloed G, Wellner JA. Estimation of a convex function: characterizations and asymptotic theory. The Annals of Statistics. 2001;29:1653–1698.
- Groeneboom P, Jongbloed G, Wellner JA. The support reduction algorithm for computing non-parametric function estimates in mixture models. Scandinavian Journal of Statistics. 2008;35:385–399. [PMC free article] [PubMed]
- Hall WJ, Wellner JA. The rate of convergence in law of the maximum of an exponential sample. Statistica Neerlandica. 1979;33:151–154.
- Hampel FR. Design, modelling and analysis of some biological data sets. In: Mallows C, editor. Design, Data and Analysis, by Some Friends of Cuthbert Daniel. New York: Wiley; 1987. pp. 111–115.
- Jewell NP. Mixtures of exponential distributions. The Annals of Statistics. 1982;10:479–484.
- Jongbloed G. Ph.D. Thesis. Delft University of Technology; 1995. Three statistical inverse problems.
- Jongbloed G. Minimax lower bounds and moduli of continuity. Statistics & Probability Letters. 2000;50:279–284.
- Kim J, Pollard D. Cube root asymptotics. The Annals of Statistics. 1990;18:191–219.
- Lévy P. Extensions d’un théorème de D. Dugué et M. Girault. Probability Theory and Related Fields. 1962;1:159–173.
- Lindsay BG. The geometry of mixture likelihoods: a general theory. The Annals of Statistics. 1983;11:86–94.
- Pfanzagl J. Consistency of maximum likelihood estimators for certain nonparametric families in particular: mixtures. Journal of Statistical Planning and Inference. 1988;19:137–158.
- Prakasa Rao BLS. Estimation of a unimodal density. Sankhyā Series A. 1969;31:23–36.
- Silverman BW. On the estimation of a probability density function by the maximum penalized likelihood method. The Annals of Statistics. 1982;10:795–810.
- van der Vaart A, Wellner JA. Preservation theorems for Glivenko–Cantelli and uniform Glivenko–Cantelli classes. In: High Dimensional Probability, II (Seattle, WA, 1999), vol. 47 of Progr. Probab. Boston, MA: Birkhäuser Boston; 2000. pp. 115–133.
- Vardi Y. Multiplicative censoring, renewal process, deconvolution and decreasing density: nonparametric estimation. Biometrika. 1989;76:751–761.
- Williamson RE. Multiply monotone functions and their Laplace transforms. Duke Mathematical Journal. 1956;23:189–207.
- Woodroofe M, Sun J. A penalized maximum likelihood estimate of f(0+) when f is nonincreasing. Statistica Sinica. 1993;3:501–515.
- Zeidler E. Nonlinear Functional Analysis and its Applications. III. Variational Methods and Optimization. Translated from the German by Leo F. Boron. New York: Springer-Verlag; 1985.
