


Commun Stat Theory Methods. Author manuscript; available in PMC 2010 October 22.

Published in final edited form as:

Commun Stat Theory Methods. 2008 January; 37(12): 1855–1866.

doi: 10.1080/03610920801893731

PMCID: PMC2962418

NIHMSID: NIHMS184242

Address correspondence to D. H. Glueck, Department of Preventive Medicine and Biometrics, University of Colorado Denver and Health Sciences Center, Campus Box B119, 4200 East Ninth Avenue, Denver, CO 80262, USA; Email: Deborah.Glueck@uchsc.edu


The Benjamini–Hochberg procedure is widely used in multiple comparisons. Previous power results for this procedure have been based on simulations. This article produces theoretical expressions for expected power. To derive them, we make assumptions about the number of hypotheses being tested, which null hypotheses are true, which are false, and the distributions of the test statistics under each null and alternative. We use these assumptions to derive bounds for multidimensional rejection regions. With these bounds and a permanent-based representation of the joint density function of the largest *p*-values, we use the law of total probability to derive the distribution of the total number of rejections. We derive the joint distribution of the total number of rejections and the number of rejections when the null hypothesis is true. We give an analytic expression for the expected power for a false discovery rate procedure that assumes the hypotheses are independent.

Our goal is to provide analytical formulas for the expected power for the Benjamini and Hochberg (1995) false discovery rate procedure. Benjamini and Hochberg (1995) controlled the False Discovery Rate, the expected ratio of the number of rejections of hypotheses where the null is true (false rejections) to the number of total rejections. With a single hypothesis, power is the probability of rejecting the null hypothesis. With multiple hypotheses, we study average, or expected power (Benjamini and Liu, 1999). Expected power is the expectation of the ratio of the number of rejections of hypotheses for which the alternative holds to the total number of hypotheses for which the alternative holds.

Many articles describe extensions of the theory for false discovery rate, or related statistics (see, e.g., Benjamini and Liu, 1999; Benjamini and Yekutieli, 2001; Curran-Everett, 2000; Efron et al., 2001; Finner and Roters, 2002; Genovese and Wasserman, 2002, 2004; Sarkar, 2002, 2004, 2006; Storey, 2002, 2003). Power has been studied almost entirely via simulation (Benjamini and Liu, 1999; Keselman et al., 2002; Lee and Whitmore, 2002; Storey, 2002).

In analytic power analysis, it is usual to make assumptions about the underlying state of nature. We assume that we know how many hypotheses we are testing and what statistics we will use to test them. We even assume that we know when each hypothesis is true or false, and the total number of true and false hypotheses. Thus, we know the true distribution of each test statistic. We also know the distribution of the *p*-value, whether uniform if the null hypothesis is true, or a different distribution, calculated under the alternative.

Using these assumptions, the expected power can be calculated by using the law of total probability. The steps in the calculation are as follows.

- Find the cumulative distribution function and the probability density function of the individual *p*-values when the null hypothesis is true or false.
- Show that the probability of rejection depends only on the largest *p*-values.
- Use a computational form for the joint density function of the largest order statistics (Balakrishnan, 2007; Vaughan and Venables, 1972).
- Enumerate and delimit the rejection regions implicitly defined by the Benjamini and Hochberg (1995) procedure in terms of *p*-value space.
- Give an explicit recursive algorithm for generating the rejection regions.
- Use the law of total probability to calculate the probability distribution function of the total number of rejections.
- Derive the joint distribution of the total rejections and false rejections.
- Finally, derive formulas for the expected power from this joint distribution.

In Sec. 2, we define the notation and adapt some previous known results to our problem. In Sec. 3, we prove lemmas on ordered lists. In Sec. 4, we derive the distributions of the number of total rejections, and false rejections, and give our main results on expected power. Technical proofs are in the Appendix.

Suppose one plans to use the Benjamini and Hochberg (1995) false discovery rate procedure for *m* hypotheses and *m* decisions. We take a frequentist, parametric view, and envision each decision as being between a null and an alternative hypothesis, which may differ for each decision. Let *n* ∈ {0, … , *m*} be the number of decisions for which a null hypothesis holds in the population, while an alternative hypothesis holds for the remaining *m* − *n* decisions. For *i* ∈ {1, … , *m*}, index the hypotheses by *H*_{i}, with associated absolutely continuous, independent, but not necessarily identically distributed real valued test statistic

In frequentist statistics, the null hypothesis is rejected when the *p*-value is smaller than a specified bound. In a multiple comparisons situation, there are many *p*-values. The Benjamini and Hochberg procedure uses the sorted *p*-values to decide which null hypotheses are rejected, and which are not. Let {*P*_{(1)} ≤ ⋯ ≤ *P*_{(m)}} be the set of order statistics for the *p*-values, with {*p*_{(1)} ≤ ⋯ ≤ *p*_{(m)}} a realization. Let *α*_{*} ∈ [0, 1] and *b*_{i} =

$${\mathcal{R}}_{k}=\begin{cases}\left\{\bigcap_{i=1}^{m}\left(p_{(i)}\ge b_{i}\right)\right\} & k=0\\ \left\{\bigcap_{i=k+1}^{m}\left(p_{(i)}\ge b_{i}\right)\cap\left(p_{(k)}\le b_{k}\right)\right\} & 1\le k\le m-1\\ \left\{p_{(m)}\le b_{m}\right\} & k=m.\end{cases}$$

(1)
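The regions in Eq. (1) can be read as a counting rule: the number of rejections is the largest index *k* with *p*_{(k)} ≤ *b*_{k}. The following is a minimal Python sketch of that rule. The function name `bh_num_rejections` is hypothetical, and the thresholds *b*_{i} = *i*·*α*_{*}/*m* are an assumption: they are the usual Benjamini and Hochberg (1995) step-up bounds, and they agree with the *m* = 2 example in Sec. 3 (*b*_{1} = *α*_{*}/2, *b*_{2} = *α*_{*}).

```python
def bh_num_rejections(pvalues, alpha):
    """Return k, the number of rejections under the step-up rule of Eq. (1).

    Assumes thresholds b_i = i * alpha / m (an assumption, see the lead-in).
    """
    m = len(pvalues)
    ordered = sorted(pvalues)          # p_(1) <= ... <= p_(m)
    k = 0
    for i, p in enumerate(ordered, start=1):
        if p <= i * alpha / m:         # p_(i) <= b_i
            k = i                      # keep the largest such index
    return k


print(bh_num_rejections([0.03, 0.04], alpha=0.05))  # 2
print(bh_num_rejections([0.03, 0.20], alpha=0.05))  # 0
```

In the first call the smallest *p*-value exceeds its own bound (0.03 > 0.025), yet both hypotheses are rejected because *p*_{(2)} = 0.04 ≤ *b*_{2} = 0.05. Only the *p*-values from *p*_{(k)} upward determine the outcome.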

Now consider repeating the experiment over and over again. Each time, because the data are stochastic, the number of rejections would vary. Thus, the number of rejections is a random variable. Let *K* ∈ {0, … , *m*} be the random variable denoting the number of rejections of null hypotheses in an experiment, and *k* its realization. Rejecting a null hypothesis is a decision which may be correct, if the alternative holds, or incorrect, if the null actually holds. Rejections of hypotheses where the null is true are Type I errors. Let *J* be the number of rejections for which the null does hold, and *j* its realization. In general, *j* ∈ {max[0, *k* − (*m* − *n*)], … , min(*n*, *k*)}. Table 1 illustrates the relationships.

To find the distribution of *K* and *J*, we need to understand the probability density of the largest *p*-values. This depends on the distribution of the test statistics. Typically, the distribution of *T*_{i} differs under the null and alternative hypotheses. Let

Using this notation, we can define the distribution and density function for each *p*-value. Since the test statistic *T*_{i} is absolutely continuous under the null and the alternative, the distribution function is smooth and monotone increasing, and the inverse distribution function exists. With

$${t}_{i}={F}_{{T}_{i}}^{-1}[1-{p}_{i};{N}_{i},{\theta}_{i}\left(h\right)].$$

(2)

Let ${q}_{i}={F}_{{T}_{i}}^{-1}[1-{p}_{i};{N}_{i},{\theta}_{i}\left(0\right)]$. For a one-tailed test for which larger values lead to rejection, the distribution and density functions for *P*_{i} are:

$${F}_{{P}_{i}}[{p}_{i};{N}_{i},{\theta}_{i}\left(h\right)]=1-{F}_{{T}_{i}}[{q}_{i};{N}_{i},{\theta}_{i}\left(h\right)],$$

(3)

$${f}_{{P}_{i}}[{p}_{i};{N}_{i},{\theta}_{i}\left(h\right)]=\frac{{f}_{{T}_{i}}[{q}_{i};{N}_{i},{\theta}_{i}\left(h\right)]}{{f}_{{T}_{i}}[{q}_{i};{N}_{i},{\theta}_{i}\left(0\right)]}.$$

(4)

Let ${q}_{Ui}={F}_{{T}_{i}}^{-1}[1-{p}_{i}/2;{N}_{i},{\theta}_{i}\left(0\right)]$ and ${q}_{Li}={F}_{{T}_{i}}^{-1}[{p}_{i}/2;{N}_{i},{\theta}_{i}\left(0\right)]$. For a two-tailed test with equal tail probabilities, the distribution for *P*_{i} is

$${F}_{{P}_{i}}[{p}_{i};{N}_{i},{\theta}_{i}\left(h\right)]=1-{F}_{{T}_{i}}[{q}_{Ui};{N}_{i},{\theta}_{i}\left(h\right)]+{F}_{{T}_{i}}[{q}_{Li};{N}_{i},{\theta}_{i}\left(h\right)].$$

(5)

Using Theorem 12.4.4 (Leithold, 1968, p. 410), the density function is:

$${f}_{{P}_{i}}[{p}_{i};{N}_{i},{\theta}_{i}\left(h\right)]=\frac{1}{2}\cdot \frac{{f}_{{T}_{i}}[{q}_{Ui};{N}_{i},{\theta}_{i}\left(h\right)]}{{f}_{{T}_{i}}[{q}_{Ui};{N}_{i},{\theta}_{i}\left(0\right)]}+\frac{1}{2}\cdot \frac{{f}_{{T}_{i}}[{q}_{Li};{N}_{i},{\theta}_{i}\left(h\right)]}{{f}_{{T}_{i}}[{q}_{Li};{N}_{i},{\theta}_{i}\left(0\right)]}.$$

(6)
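As an illustration of Eqs. (3), (4), and (6), the sketch below evaluates the *p*-value distribution and density for a normally distributed test statistic with unit variance. This is an assumed special case, not the general setting of the article: the null corresponds to mean 0 and the alternative to a mean `shift`. The function names are hypothetical, and only Python's standard `statistics.NormalDist` is used.

```python
from statistics import NormalDist

STD = NormalDist()  # assumed null distribution of the test statistic, theta_i(0)


def pvalue_cdf_one_tailed(p, shift):
    """Eq. (3): F_P(p) = 1 - F_T(q; alternative), with q the null quantile at 1 - p."""
    q = STD.inv_cdf(1.0 - p)
    return 1.0 - NormalDist(mu=shift).cdf(q)


def pvalue_pdf_one_tailed(p, shift):
    """Eq. (4): ratio of the alternative density to the null density at q."""
    q = STD.inv_cdf(1.0 - p)
    return NormalDist(mu=shift).pdf(q) / STD.pdf(q)


def pvalue_pdf_two_tailed(p, shift):
    """Eq. (6): average of the density ratios at the upper and lower null quantiles."""
    alt = NormalDist(mu=shift)
    q_upper = STD.inv_cdf(1.0 - p / 2.0)
    q_lower = STD.inv_cdf(p / 2.0)
    ratio_upper = alt.pdf(q_upper) / STD.pdf(q_upper)
    ratio_lower = alt.pdf(q_lower) / STD.pdf(q_lower)
    return 0.5 * ratio_upper + 0.5 * ratio_lower


print(pvalue_pdf_one_tailed(0.3, shift=0.0))   # null case: density of a uniform p-value
print(pvalue_pdf_one_tailed(0.01, shift=1.0))  # alternative: mass piles up near zero
```

With `shift=0.0` both density functions reduce to 1 on (0, 1), which matches the statement that the *p*-value is uniform when the null hypothesis is true.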

For clarity, abbreviate the distribution function and density for *P*_{i} by

We need to look at the joint density of the *p*-values to understand the power for the Benjamini and Hochberg procedure. But which *p*-values? Suppose *m* = 10 and *k* = 4, so 4 hypotheses are rejected by the Benjamini and Hochberg procedure. The idea here is that the 4th smallest *p*-value is no larger than its bound; under the step-up rule of Eq. (1), the three smaller *p*-values are then rejected as well. We need to look also at the 5th smallest *p*-value, to make sure that it is bigger than its bound. If it were smaller than its bound, we would have rejected five, not four. We also need to look at the 6th smallest one, and the 7th, 8th, 9th, and 10th. In fact, we need the joint density of the largest (*m* − *k*) + 1 of the *m* *p*-values to figure out the probability of rejecting *k* hypotheses.

A convenient form of the joint density of independent, but not necessarily identically distributed random variables is in terms of a permanent (Balakrishnan, 2007; Vaughan and Venables, 1972). The permanent of a square matrix is defined like the determinant, except that all signs are positive (Aitken, 1999, p. 30). If **A** is a square matrix, we write its permanent by per [

$$\begin{aligned} & f_{P_{(k)},\dots,P_{(m)}}\left(p_{(k)},\dots,p_{(m)}\right)\\ &\quad=[(k-1)!]^{-1}\times\text{per}\left[\begin{array}{cccc} F_{P_{1}}\left(p_{(k)}\right) & F_{P_{2}}\left(p_{(k)}\right) & \cdots & F_{P_{m}}\left(p_{(k)}\right)\\ \vdots & \vdots & & \vdots\\ F_{P_{1}}\left(p_{(k)}\right) & F_{P_{2}}\left(p_{(k)}\right) & \cdots & F_{P_{m}}\left(p_{(k)}\right)\\ \hline f_{P_{1}}\left(p_{(k)}\right) & f_{P_{2}}\left(p_{(k)}\right) & \cdots & f_{P_{m}}\left(p_{(k)}\right)\\ \vdots & \vdots & & \vdots\\ f_{P_{1}}\left(p_{(m)}\right) & f_{P_{2}}\left(p_{(m)}\right) & \cdots & f_{P_{m}}\left(p_{(m)}\right) \end{array}\right], \end{aligned}$$

(7)

where the first block contains *k* − 1 rows, and the second block contains (*m* − *k*) + 1 rows.

When the test statistics are identically distributed, the result reduces (David, 1981, p. 10) to

$${f}_{{P}_{\left(k\right)},\dots ,{P}_{\left(m\right)}}({p}_{\left(k\right)},\dots ,{p}_{\left(m\right)})=\frac{m!{\left[F\left({p}_{\left(k\right)}\right)\right]}^{k-1}}{(k-1)!}\prod _{i=k}^{m}f\left({p}_{\left(i\right)}\right).$$

(8)
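To make the permanent form concrete, the sketch below computes a permanent by direct expansion over permutations and assembles the matrix of Eq. (7). The function names are hypothetical, and the brute-force expansion costs on the order of *m*! operations, so it is meant only for small *m*. It is an illustration, not the computational method of the cited authors.

```python
from itertools import permutations
from math import factorial, prod


def permanent(matrix):
    """Permanent of a square matrix: the determinant expansion with all signs positive."""
    m = len(matrix)
    return sum(
        prod(matrix[row][cols[row]] for row in range(m))
        for cols in permutations(range(m))
    )


def joint_density_largest(p_ordered, k, cdfs, pdfs):
    """Eq. (7): joint density of P_(k), ..., P_(m) at p_ordered = [p_(k), ..., p_(m)].

    cdfs[i] and pdfs[i] are the distribution and density functions of P_i.
    """
    top = [[F(p_ordered[0]) for F in cdfs] for _ in range(k - 1)]  # k - 1 rows of F(p_(k))
    bottom = [[f(p) for f in pdfs] for p in p_ordered]             # (m - k) + 1 rows of f
    return permanent(top + bottom) / factorial(k - 1)


# Identically distributed uniform p-values, m = 3, k = 2 (toy values).
cdfs = [lambda p: p] * 3
pdfs = [lambda p: 1.0] * 3
print(joint_density_largest([0.4, 0.7], 2, cdfs, pdfs))
```

For this toy case Eq. (8) gives 3! · 0.4 / 1! = 2.4, and the permanent form returns the same value up to floating-point rounding, which is a quick check that Eq. (7) reduces to Eq. (8) when the columns are identical.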

In Eq. (1), we described the rejection regions corresponding to the Benjamini and Hochberg (1995) procedure. Although these are exact definitions of the rejection regions, they do not describe all the ways the *p*-values could be arranged with respect to the bounds. If we understand all the ways that the *p*-values can be arranged with respect to the bounds, we can define a set of rejection regions in *p*-value space. If we integrate the joint density of the *p*-values over this rejection region, we will get the probability distribution of the number of rejections.

There are many ways the *p*-values can be arranged with respect to the bounds and still cause a rejection. First, let us consider a small example, and demonstrate how we can proceed from a pictorial representation, to a set of ordered lists, to a set of integration bounds. Suppose that we are considering two hypotheses, so *m* = 2. How many ways can we reject both hypotheses? Let *b*_{1} = *α*_{*}/2, and *b*_{2} = *α*_{*}. The rejection region is {*p*_{(2)} ≤ *b*_{2}}. Notice that the rejection region can be satisfied in three different ways, shown as number lines in Fig. 1.

We can represent the number lines shown in Fig. 1 abstractly as ordered lists. The ordered lists corresponding to Fig. 1 are shown in Fig. 2. We still need to define the bounds of the rejection region. For this small example, the bounds are shown in Fig. 3.
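The count of three arrangements in the *m* = 2 example can be checked by brute force. The sketch below, with hypothetical names, lists every ordering of the four quantities *p*_{(1)}, *p*_{(2)}, *b*_{1}, *b*_{2} and keeps those consistent with the sorted *p*-values, the sorted bounds, and the rejection region {*p*_{(2)} ≤ *b*_{2}}. It assumes these are the arrangements the figures depict, since the figures themselves are not reproduced here.

```python
from itertools import permutations

labels = ["p(1)", "p(2)", "b1", "b2"]


def is_valid(order):
    """Check the constraints for rejecting both hypotheses when m = 2."""
    pos = {name: order.index(name) for name in labels}
    return (
        pos["p(1)"] < pos["p(2)"]      # order statistics are sorted
        and pos["b1"] < pos["b2"]      # b_1 = alpha_*/2 lies below b_2 = alpha_*
        and pos["p(2)"] < pos["b2"]    # rejection region {p_(2) <= b_2}
    )


arrangements = [order for order in permutations(labels) if is_valid(order)]
for order in arrangements:
    print(" < ".join(order))
print(len(arrangements))  # 3
```

Each printed line is one ordered list of the kind described above, read from left (smallest) to right (largest).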

For the general case, we need to formally define ordered lists. Suppose *q* is a finite, positive integer, and *a*_{1}, … , *a*_{q} are real numbers, with

$$\mathcal{C}=\mathcal{A}\&\mathcal{B},$$

(9)

with $\mathcal{C}$ an ordered list whose elements are the entries in $\mathcal{A}$ followed by those in $\mathcal{B}$.

Now we will give two lemmas that describe how to generate and use ordered lists for more general cases. We will use these ordered lists to define the rejection regions for Benjamini and Hochberg (1995). The proofs are given in the Appendix. The first lemma shows how to generate these ordered lists for a specific number of hypothesis tests.

The second lemma shows that these ordered lists correspond to ordered sets of ordered pairs that are bounds for integration. Integrating the joint density of the (*m* − *k*) + 1 largest order statistics of the *p*-values over these bounds will give the probabilities of rejection.

*Suppose k* ∈ {0, … , *m*} *indexes the number of rejections. Let*

$${c}_{k}=[2\cdot (m-k)]!/[(m-k)!(m-k)!(m-k+1)].$$

(10)

*Let p* ∈ {1, … , *c*_{k}},

$$0\le {e}_{k,p,1}\le \cdots \le {e}_{k,p,2k+2}\le 1.$$

(11)

*Notate the list itself by letting*

$${\mathcal{L}}_{k,p}=\{0,{e}_{k,p,1},\dots ,{e}_{k,p,2k+2},1\}.$$

(12)

*Let* ${\mathcal{L}}_{k}$ *be the set of such ordered lists so that*

$${\mathcal{L}}_{k}=\{{\mathcal{L}}_{k,1},{\mathcal{L}}_{k,2},\dots ,{\mathcal{L}}_{k,{c}_{k}}\}.$$

(13)

*Then the set* ${\mathcal{L}}_{k-1}$ *can be generated from the set* ${\mathcal{L}}_{k}$. *The number of entries in* ${\mathcal{L}}_{k}$ *is c*_{k}, *a Catalan number.*
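The count in Eq. (10) is easy to tabulate. The sketch below evaluates *c*_{k} directly from the formula and compares it with the standard closed form of the Catalan numbers, C_d = binom(2d, d)/(d + 1) with *d* = *m* − *k*. The function names are hypothetical.

```python
from math import comb, factorial


def num_ordered_lists(m, k):
    """c_k from Eq. (10): [2(m-k)]! / [(m-k)! (m-k)! (m-k+1)]."""
    d = m - k
    return factorial(2 * d) // (factorial(d) * factorial(d) * (d + 1))


def catalan(d):
    """Closed form of the d-th Catalan number, for comparison."""
    return comb(2 * d, d) // (d + 1)


m = 5
for k in range(m + 1):
    print(k, num_ordered_lists(m, k), catalan(m - k))
```

Because the Catalan numbers grow roughly like 4^d, the number of integration regions for small *k* increases quickly with *m*, which is worth keeping in mind before attempting the sums in Sec. 4 for large *m*.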

*Let k* ∈ {0, … , *m*} *index the number of rejections and p* ∈ {1, … , *c*_{k}} *be an index variable for the number of ordered lists in the set* ${\mathcal{L}}_{k}$.

$${\mathcal{B}}_{k,p}=\begin{cases}\{(0,u_{m,p,1}),(l_{m,p,2},u_{m,p,2}),\dots,(l_{m,p,k},u_{m,p,k})\} & k=0\\ \{(0,u_{k,p,1}),(l_{k,p,2},u_{k,p,2}),\dots,(l_{k,p,k},u_{k,p,k}),(b_{k+1},1)\} & 1\le k\le m-1\\ \{(0,b_{m})\} & k=m.\end{cases}$$

(14)

*Let* ${\mathcal{B}}_{k}$ *be the set of such ordered lists so that*

$${\mathcal{B}}_{k}=\{{\mathcal{B}}_{k,1},\dots ,{\mathcal{B}}_{k,{c}_{k}}\}.$$

(15)

*Then*:

- ${\mathcal{B}}_{k,p}$ *can be formed from the ordered lists* ${\mathcal{L}}_{k,p}$ *in Lemma* 3.1 *by construction*.
- ${\mathcal{B}}_{k}$ *can be formed from the set* ${\mathcal{L}}_{k}$.
- *The number of entries in* ${\mathcal{B}}_{k}$ *is c*_{k}.
- *For* 0 ≤ *k* ≤ *m* − 1, *the k* + 1 *ordered pairs in* ${\mathcal{B}}_{k,p}$ *are bounds of integration for a k* + 1 *dimensional integral. For k* = *m*, *the ordered pair in* ${\mathcal{B}}_{m,p}$ *is the bounds of integration for a one dimensional integral. The ordered pairs in* ${\mathcal{B}}_{k,p}$ *delineate regions in* ${\Re}^{k+1}$, *and the ordered pair in* ${\mathcal{B}}_{m,p}$ *delineates a region in* $\Re$.
- *These are the rejection regions from* Eq. (1).

We now have expressions for the joint density of the order statistics of the *p*-values (Eq. (7)), and an expression for the bounds of the rejection regions (Lemma 3.2). Using the law of total probability, in this section we derive the probability distribution for the total number of rejections, and the joint distribution of the total number of rejections and the number of false rejections. We use these results to find expected power.

The distribution of the number of rejections for independent, but not necessarily identically distributed random variables, and corresponding *p*-values is given by:

$$\mathrm{Pr}\{K=k\}=\begin{cases}\sum\limits_{p=1}^{c_{0}}\int_{\mathcal{B}_{0,p}}m!\prod\limits_{i=1}^{m}f_{P_{i}}\left(p_{(i)}\right)dp_{(1)}\dots dp_{(m)} & k=0\\ \sum\limits_{p=1}^{c_{k}}\int_{\mathcal{B}_{k,p}}f_{P_{(k)},\dots,P_{(m)}}\left(p_{(k)},\dots,p_{(m)}\right)dp_{(k)}\dots dp_{(m)} & 1\le k\le m-1\\ \prod\limits_{i=1}^{m}F_{P_{i}}\left(b_{m}\right) & k=m.\end{cases}$$

(16)

For independent and identically distributed random variables, and corresponding *p*-values

$$\mathrm{Pr}\{K=k\}=\begin{cases}\sum\limits_{p=1}^{c_{0}}\int_{\mathcal{B}_{0,p}}m!\prod\limits_{i=1}^{m}f\left(p_{(i)}\right)dp_{(1)}\dots dp_{(m)} & k=0\\ \sum\limits_{p=1}^{c_{k}}\int_{\mathcal{B}_{k,p}}\frac{m!\left[F\left(p_{(k)}\right)\right]^{k-1}}{(k-1)!}\prod\limits_{i=k}^{m}f\left(p_{(i)}\right)dp_{(k)}\dots dp_{(m)} & 1\le k\le m-1\\ \left[F\left(b_{m}\right)\right]^{m} & k=m.\end{cases}$$

(17)

For independent and identically and uniformly distributed random variables,

$$\mathrm{Pr}\{K=k\}=\begin{cases}1-\alpha_{*} & k=0\\ \sum\limits_{p=1}^{c_{k}}\int_{\mathcal{B}_{k,p}}\frac{m!\left(p_{(k)}\right)^{k-1}}{(k-1)!}dp_{(k)}\dots dp_{(m)} & 1\le k\le m-1\\ \left(b_{m}\right)^{m} & k=m.\end{cases}$$

(18)

Results 4.1–4.3 follow from the law of total probability. The probability of rejection is given by integrating the appropriate density over the multidimensional region. Result 4.3 occurs when the null holds for every hypothesis, and the *p*-values are then uniformly and identically distributed.
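The two closed-form entries of Eq. (18) can be checked by simulation. The sketch below draws independent uniform *p*-values, applies the step-up rule, and tabulates the number of rejections. It is a Monte Carlo check, not the analytic method of this article. The function names are hypothetical, and the thresholds *b*_{i} = *i*·*α*_{*}/*m* are assumed to be the usual Benjamini and Hochberg (1995) bounds.

```python
import random


def bh_count(pvalues, alpha):
    """Number of rejections: largest i with p_(i) <= i * alpha / m (assumed bounds)."""
    m = len(pvalues)
    k = 0
    for i, p in enumerate(sorted(pvalues), start=1):
        if p <= i * alpha / m:
            k = i
    return k


def simulate_null_distribution(m, alpha, reps, seed=0):
    """Estimate Pr{K = k}, k = 0..m, when every null is true (uniform p-values)."""
    rng = random.Random(seed)
    counts = [0] * (m + 1)
    for _ in range(reps):
        pvalues = [rng.random() for _ in range(m)]
        counts[bh_count(pvalues, alpha)] += 1
    return [c / reps for c in counts]


m, alpha = 3, 0.2
estimate = simulate_null_distribution(m, alpha, reps=100_000, seed=1)
print("simulated Pr{K=0}:", estimate[0], " Eq. (18):", 1 - alpha)
print("simulated Pr{K=m}:", estimate[m], " Eq. (18):", alpha ** m)
```

With *b*_{m} = *α*_{*}, the simulated frequencies of *K* = 0 and *K* = *m* should lie close to 1 − *α*_{*} and *α*_{*}^{m}, up to Monte Carlo error.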

Now, we need to derive the joint probability distribution of the total number of rejections and the number of false rejections. Without loss of generality, we can label the hypotheses *H*_{i} so that for

Index the *m*! terms in the expansion of the permanent in Eq. (7) by *v* ∈ {1, … , *m*!}, and denote term *v* by *a*_{v}. Each term is a product of

$${f}_{{P}_{\left(k\right)},\dots ,{P}_{\left(m\right)}}({p}_{\left(k\right)},\dots ,{p}_{\left(m\right)})=\sum _{j}\sum _{v:{a}_{v}\in {\mathcal{C}}_{n-j}}{a}_{v}.$$

(19)

If the number of hypotheses for which the alternative holds is zero, then *n* = *m*. Then the number of rejections of null hypotheses is equal to the number of total rejections and *k* = *j*. Then Pr{*J* = *j*, *K* = *k*} = Pr{*J* = *j*} = Pr{*K* = *k*}, the distribution in Result 4.1.

We also give some special cases that may be useful, depending on the testing situation. They can be derived from Results 4.2 and 4.3.

For independent, but not necessarily identically distributed random variables, and with *j* ∈ {max[0, *k* − (*m* − *n*)], … , min(*n*, *k*)},

$$\mathrm{Pr}\{J=j,K=k\}=\begin{cases}\sum\limits_{p=1}^{c_{0}}\int_{\mathcal{B}_{0,p}}m!\prod\limits_{i=1}^{m}f_{P_{i}}\left(p_{(i)}\right)dp_{(1)}\dots dp_{(m)} & k=0,j=0\\ \sum\limits_{p=1}^{c_{k}}\int_{\mathcal{B}_{k,p}}\sum\limits_{v:a_{v}\in\mathcal{C}_{n-j}}a_{v}\,dp_{(k)}\dots dp_{(m)} & 1\le k\le m-1\\ \prod\limits_{i=1}^{m}F_{P_{i}}\left(b_{m}\right) & k=m,j=n.\end{cases}$$

(20)

With ${}_{a}C_{b}$ =

$$\mathrm{Pr}\{J=j,K=k\}={}_{n}C_{j}\cdot {}_{(m-n)}C_{(k-j)}\cdot \mathrm{Pr}\{K=k\}/{}_{m}C_{k}.$$

(21)
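Equation (21) splits Pr{*K* = *k*} across the admissible values of *j* with hypergeometric weights. The sketch below only evaluates the formula as printed; the function name and the numerical inputs are hypothetical, and ${}_{a}C_{b}$ is taken to be the binomial coefficient, which is an assumption because its definition is cut off in the text above.

```python
from math import comb


def joint_prob(j, k, m, n, prob_k):
    """Eq. (21): Pr{J=j, K=k} = nCj * (m-n)C(k-j) * Pr{K=k} / mCk.

    prob_k is Pr{K = k}; values of j outside the admissible range get probability 0.
    """
    if j < max(0, k - (m - n)) or j > min(n, k):
        return 0.0
    return comb(n, j) * comb(m - n, k - j) * prob_k / comb(m, k)


m, n, k, prob_k = 5, 3, 2, 0.3   # toy values, not taken from the article
terms = {j: joint_prob(j, k, m, n, prob_k) for j in range(0, k + 1)}
print(terms)
print(sum(terms.values()))       # should recover prob_k up to rounding
```

Summing over *j* returns Pr{*K* = *k*} by Vandermonde's identity, a useful sanity check on any implementation of the joint distribution.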

Results 4.4–4.6 follow from direct applications of the law of total probability.

We now have the joint distribution of the total number of rejections, *K*, and the number of rejections of true nulls, *J*. Power for one realization of the experiment is (*k* − *j*)/(*m* − *n*). The numerator is the total number of rejections minus the number of true nulls which are rejected. Thus, it is the number of hypotheses for which the alternative is true that are rejected. The denominator is the number of hypotheses for which the alternative is true. This quantity is also known as sensitivity. Benjamini and Liu (1999) called the expected value of this quantity expected power, and used it in simulations as a measure of the success of an experiment that tested multiple hypotheses. We can now give an explicit, analytic formula for expected power.

For *m* ≠ *n*, i.e., when the null is not true for every hypothesis, the expected power is given by:

$$\mathcal{E}[(K-J)/(m-n)]=\sum _{k=0}^{m}\sum _{j}[(k-j)/(m-n)]\mathrm{Pr}\{K=k;J=j\}.$$

(22)

A formula of Lindgren (1976, p. 116) gives the expected value of a function of two random variables in terms of the joint distribution. The result then follows from the definitions of the various estimators.
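Once a joint distribution of (*K*, *J*) is available, Eq. (22) is a weighted sum. The sketch below evaluates it for a joint probability table supplied as a dictionary. The function name and the toy table are hypothetical and serve only to show the arithmetic.

```python
def expected_power(joint, m, n):
    """Eq. (22): sum over (k, j) of [(k - j) / (m - n)] * Pr{K = k; J = j}.

    joint maps (k, j) to Pr{K = k; J = j}; the formula requires m != n.
    """
    if m == n:
        raise ValueError("expected power is undefined when every null is true")
    return sum((k - j) / (m - n) * prob for (k, j), prob in joint.items())


# Toy joint distribution for m = 2 hypotheses with n = 1 true null.
toy_joint = {(0, 0): 0.5, (1, 0): 0.3, (1, 1): 0.1, (2, 1): 0.1}
print(expected_power(toy_joint, m=2, n=1))
```

In the toy table only the outcomes (1, 0) and (2, 1) contain a correct rejection, so the sum is 0.3 + 0.1 = 0.4, up to floating-point rounding.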

The expected power is given in Eq. (22). How does the power depend on the sample size? The distribution of each test statistic, *T*_{i}, includes the sample size for the hypothesis

The authors thank Gary Grunwald for his close reading and numerous suggestions, which greatly improved the manuscript.

Glueck was supported by NCI K07CA88811. Muller was supported by NCI P01 CA47 982-04, NCI R01 CA095749-01A1, and NIAID 9P30 AI 50410. Hunter was supported by NLM 5R01LM008111-03 and NCI 5 P30 CA46934-15.

By construction.

1. With *k* = *m*, let

   $${\mathcal{L}}_{m}={\mathcal{L}}_{m,1}=\{0,{p}_{\left(m\right)},{b}_{m},1\}.$$

   (23)

2. If *k* = 0, then do the following:
   - In ${\mathcal{L}}_{k,p}$, for all *p* ∈ {1, … , *c*_{k}}, delete *p*_{(0)} and *b*_{0}.
   - Stop generating lists.
3. Otherwise, repeat the following steps for *p* ∈ {1, … , *c*_{k}}. Consider ${\mathcal{L}}_{k,p}$.
   - Insert the ordered list {*p*_{(k−1)}, *b*_{k−1}} after 0 in each ordered list that corresponds to *k* rejections.
   - Remove *p*_{(k)} from the list.
   - From the resulting list, remove the prefix, which is the ordered list {0, … , *b*_{k}}.
   - From the resulting list, remove the suffix, which is the ordered list {*p*_{(k+1)}, 1}. If *k* = *m*, remove the suffix, which is the ordered list {1}.
   - What remains is the core. If the core has nothing in it, insert *p*_{(k)}. Otherwise, insert *p*_{(k)} sequentially before and after every entry of the core. The resulting ordered lists are the new cores.
   - For each core, insert the prefix at the beginning, and add the suffix onto the end. The resulting ordered lists are the elements of the set ${\mathcal{L}}_{k-1}$, the set of ordered lists that correspond to *k* − 1 rejections.
4. Let *k* = *k* − 1, and go to Step 2.

There are five separate statements in Lemma 3.2. We present the proofs of the five statements in order.

1. By construction. ${\mathcal{B}}_{m,p}=\left\{(0,{b}_{m})\right\}$ by definition. For *k* ∈ {0, … , *m* − 1}, ${\mathcal{B}}_{k,p}$ can be formed from ${\mathcal{L}}_{k,p}$ using the following algorithm.
   - Write the first three elements of ${\mathcal{L}}_{k,p}$ as an ordered list. Call it the trio. The fourth through last elements of ${\mathcal{L}}_{k,p}$ call the *remainder*.
     - Step i. If the middle element is of the form *p*_{(o)}, define *l*_{k,p,1} to be the first element of the trio, and define *u*_{k,p,1} to be the third element of the trio, and then do the following. Otherwise, proceed to Step ii.
       - Add the ordered pair (*l*_{k,p,1}, *u*_{k,p,1}) to the ordered list ${\mathcal{B}}_{k,p}$.
       - Form a new trio with the same first element. The middle element of the new trio is the third element of the original trio. The third element of the new trio is the first element of the remainder.
       - Remove the first element of the remainder. The new remainder is the old remainder with the first element removed.
     - Step ii. If the middle element of the trio is of the form *b*_{j}, then do the following.
       - Delete the first element of the trio.
       - Form a new trio. Let the middle element be the new first element, the new second element be the old third element, and let the new third element be the first item in the remainder.
       - Remove the first element of the remainder. The new remainder is the old remainder with the first element removed.
   - Using the new trio and the new remainder defined above, repeat Steps i and ii until all the elements of ${\mathcal{L}}_{k,p}$ are exhausted.
2. ${\mathcal{B}}_{k}$ can be formed from the set ${\mathcal{L}}_{k}$ in the following manner. First, form all the ${\mathcal{B}}_{k,p}$ from ${\mathcal{L}}_{k,p}$ in ${\mathcal{L}}_{k}$ as shown above. Second, ${\mathcal{B}}_{k}$ is simply the set of all the ${\mathcal{B}}_{k,p}$.
3. This holds since the number of elements in ${\mathcal{L}}_{k}$ is *c*_{k} and there are the same number of elements in ${\mathcal{L}}_{k}$ and ${\mathcal{B}}_{k}$.
4. By inspection.
5. They correspond to the rejection regions by construction.

**Mathematics Subject Classification** Primary 62H15; Secondary 65F03.

- Aitken AC. Determinants and Matrices. Oliver and Boyd; Edinburgh: 1999.
- Balakrishnan N. Permanents, order statistics, outliers and robustness. Revista Matematica Complutense. 2007;20:7–107.
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 1995;57:289–300.
- Benjamini Y, Liu W. A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence. J. Statist. Plann. Infer. 1999;82:163–170.
- Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 2001;29:1165–1188.
- Curran-Everett D. Multiple comparisons: philosophies and illustrations. Am. J. Physiol. Regul. Integr. Comp. Physiol. 2000;279:R1–R8. [PubMed]
- David HA. Order Statistics. 2nd ed. Wiley; New York: 1981.
- Efron B, Storey J, Tibshirani R. Microarrays, empirical Bayes methods, and false discovery rates. J. Amer. Statist. Assoc. 2001;96:1151–1160.
- Finner H, Roters M. Multiple hypothesis testing and expected number of Type 1 errors. Ann. Statist. 2002;30(1):220–238.
- Genovese CR, Wasserman L. Operating characteristics and extensions of the false discovery rate procedure. J. Roy. Stat. Soc. Ser. B Statist. Methodol. 2002;64:499–517.
- Genovese CR, Wasserman L. A stochastic process approach to false discovery control. Ann. Statist. 2004;32(3):1035–1061.
- Keselman HJ, Cribbie R, Holland B. Controlling the rate of Type 1 error over a large set of statistical tests. Bri. J. Math. Statist. Psych. 2002;55:27–39. [PubMed]
- Lee M, Whitmore G. Power and sample size for DNA microarray studies. Statist. Med. 2002;21:3543–3570. [PubMed]
- Leithold L. The Calculus with Analytic Geometry. Harper and Row; New York: 1968.
- Lindgren BW. Statistical Theory. 3rd ed. Macmillan Publishing; New York: 1976.
- Sarkar SK. Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist. 2002;30(1):239–257.
- Sarkar SK. FDR-controlling stepwise procedures and their false negatives rates. J. Statist. Plann. Infer. 2004;125:119–137.
- Sarkar SK. False discovery and false nondiscovery rates in single-Step multiple testing procedures. Ann. Statist. 2006;34(1):394–415.
- Storey J. A direct approach to the false discovery rate. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 2002;64:479–598.
- Storey J. The positive false discovery rate: a Bayesian interpretation and the *q*-value. Ann. Statist. 2003;31(6):2013–2035.
- Vaughan RJ, Venables WN. Comments and queries: permanent expressions for order statistic densities. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 1972;34:308–310.
