
Comput Stat Data Anal. Author manuscript; available in PMC 2010 July 1.

Published in final edited form as: Comput Stat Data Anal. 2009 July 1; 53(9): 3314–3323. doi: 10.1016/j.csda.2009.02.006

PMCID: PMC2678871. NIHMSID: NIHMS96244.

**Summary**

Diarrhoea-associated *haemolytic uraemic syndrome* (HUS) is a disease that affects the kidneys and other organs. Motivated by the annual numbers of cases of HUS collected in Birmingham and Newcastle, England, from 1970 to 1989, we consider Bayesian changepoint analysis with specific attention to Poisson changepoint models. For changepoint models with an unknown number of changepoints, we propose a new non-iterative Bayesian sampling approach (called exact IBF sampling), which completely avoids the convergence and slow-convergence problems associated with iterative *Markov chain Monte Carlo* (MCMC) methods. The idea is to first utilize the sampling *inverse Bayes formula* (IBF) to derive the conditional distribution of the latent data given the observed data, and then to draw iid samples from the complete-data posterior distribution. For the purpose of selecting the appropriate model (i.e., determining the number of changepoints), we develop two alternative formulae that calculate the marginal likelihood (or Bayes factor) exactly, using the exact IBF output and the point-wise IBF, respectively. The HUS data are re-analyzed using the proposed methods, and simulations are implemented to validate their performance.

**1. Introduction**

Diarrhoea-associated *haemolytic uraemic syndrome* (HUS) is a disease that affects the kidneys and other organs. It poses a substantial threat to infants and young children as one of the leading causes of both acute and chronic kidney failure. HUS is most common in the warmer months of the year, following a gastrointestinal illness caused primarily by a particular strain of bacterium, *Escherichia coli* O157:H7 (Milford *et al*., 1990). These bacteria produce extremely potent toxins, which are the main cause of the symptoms of the gastrointestinal illness. Table 1 displays the annual numbers of cases of HUS collected in Birmingham and Newcastle, England, from 1970 to 1989 (Tarr *et al*., 1989; Henderson and Matthews, 1993). The primary concern is the incidence of HUS and when the frequency of cases increases sharply. In the mean-corrected cumulative sum plot (Figure 1), the annual totals appear to increase abruptly at about 1980 for the Birmingham series and at about 1976 and 1984 for the Newcastle series. A changepoint analysis of the data with Poisson models therefore seems appropriate.

Table 1. Counts of cases of HUS at Birmingham and Newcastle (Tarr *et al*., 1989).

*Changepoint problems* (CPPs) are often encountered in medicine and other fields, e.g., economics, finance, psychology, signal processing, industrial system control and geology. Typically, a sequence of data is collected over a period of time, and we wish to make inferences about the location of one or more points of the sequence at which the model changes. The literature on CPPs is extensive. For Poisson process CPPs, a well-known example concerns British coal-mining disasters from 1851 to 1962 (originally gathered by Maguire *et al*. (1952) and corrected by Jarrett (1979)). Frequentist investigations appear in Worsley (1986) and Siegmund (1988), while traditional Bayesian analysis and *Markov chain Monte Carlo* (MCMC) hierarchical Bayesian analysis are presented in Raftery and Akman (1986) and Carlin, Gelfand and Smith (1992), respectively. Arnold (1993) considered the application of the Gibbs sampler to a Poisson distribution with a changepoint. For binomial CPPs, Smith (1975) presented the conventional Bayesian approach for a finite sequence of independent observations, with details on the binomial single-changepoint model. Smith (1980) studied binomial multiple-changepoint models, which were later investigated by Stephens (1994) using the Gibbs sampler. For binary CPPs, Halpern (1999) applied a novel changepoint statistic, based on the minimum value of Fisher's exact test over possible changepoint locations, to assess recombination in genetic sequences of HIV. For multiple change-point models, Chib (1998) provided a comparison study and Fearnhead and Liu (2007) proposed an on-line algorithm. Comprehensive reviews of CPPs are given by Brodsky and Darkhovsky (1993), Chen and Gupta (2000) and, more recently, Wu (2005).

The primary objective in the analysis of CPPs is to make inferences about the unknown changepoints and the associated parameters. Although MCMC methods can be employed in such Bayesian analyses, in our view the main difficulty lies in monitoring the convergence of the Markov chains; in addition, they can suffer from slow convergence. These issues have prompted some researchers to take the view that MCMC methods should be used only when there is no better alternative (see, e.g., the discussions in Evans and Swartz (1995, 2000) and Hobert and Casella (1996)). In this article, we first propose a new non-iterative Bayesian sampling approach (called exact IBF sampling), which completely avoids the problems of convergence and slow convergence. The idea is to first utilize the sampling-wise *inverse Bayes formula* (IBF; Tan *et al*., 2003) to derive the conditional distribution of the missing data given the observed data, and then to draw iid samples from the complete-data posterior distribution.

In practice, we are generally uncertain about the number of changepoints. Hence, model determination is the first task in changepoint analysis. Let *M*_{s} represent a model with *s* changepoints. To determine the number of changepoints (i.e., to choose among the candidate models {*M*_{s}}), we then develop two alternative formulae that calculate the marginal likelihood (or Bayes factor) of each model exactly, using the exact IBF output and the point-wise IBF, respectively.

The rest of this paper is organized as follows. In Section 2, we formulate general Bayesian changepoint models. In Section 3, we propose a non-iterative Bayesian sampling approach (the exact IBF sampling) and derive two simple formulae to calculate the marginal likelihood exactly. Section 4 considers Poisson models with single and multiple changepoints and the corresponding Bayesian model selection. In Section 5, we re-analyze the HUS data using the proposed methods. Simulations are conducted to validate the performance of the proposed methods in Section 6. Concluding remarks are given in Section 7.

**2. Bayesian formulation for changepoint problems**

Let $Y_{\text{obs}}=\{y_i\}_{i=1}^{n}$ denote a realization of the sequence of independent random variables $\{Y_i\}_{i=1}^{n}$ of length *n*. The random variables $\{Y_i\}_{i=1}^{n}$ are said to have a changepoint at *r* (1 ≤ *r* ≤ *n*) if $Y_i \sim f_1(\cdot\mid\theta_1)$ for *i* = 1,…,*r* and $Y_i \sim f_2(\cdot\mid\theta_2)$ for *i* = *r*+1,…,*n*, with *r* = *n* corresponding to 'no change'. The likelihood function is

$$L(Y_{\text{obs}}\mid r,\theta_1,\theta_2)=\prod_{i=1}^{r}f_1(y_i\mid\theta_1)\cdot\prod_{i=r+1}^{n}f_2(y_i\mid\theta_2).$$

(2.1)

Using *π*(*r*, *θ*_{1}, *θ*_{2}) as a joint prior distribution for *r*, *θ*_{1}, and *θ*_{2}, the joint posterior distribution is given by

$$f(r,\theta_1,\theta_2\mid Y_{\text{obs}})\propto L(Y_{\text{obs}}\mid r,\theta_1,\theta_2)\cdot\pi(r,\theta_1,\theta_2).$$

(2.2)

This single-changepoint problem can be easily generalized to incorporate multiple changes in the sequence, and the Bayesian formulation for the multiple-changepoint problem is almost identical with that for the single-changepoint problem. Let *M*_{s} represent a model with *s* changepoints **r** = (*r*_{1},…,*r*_{s}), where $Y_i \sim f_j(\cdot\mid\theta_j)$ for *r*_{j−1} + 1 ≤ *i* ≤ *r*_{j} and *j* = 1,…,*s*+1. The joint posterior distribution is

$$\begin{aligned}f(\mathbf{r},\theta_1,\dots,\theta_{s+1}\mid Y_{\text{obs}})&\propto L(Y_{\text{obs}}\mid\mathbf{r},\theta_1,\dots,\theta_{s+1})\cdot\pi(\mathbf{r},\theta_1,\dots,\theta_{s+1})\\&=\Big\{\prod_{j=1}^{s+1}\prod_{i=r_{j-1}+1}^{r_j}f_j(y_i\mid\theta_j)\Big\}\cdot\pi(\mathbf{r},\theta_1,\dots,\theta_{s+1}),\end{aligned}$$

(2.3)

where *r*_{0} ≡ 0, *r*_{s+1} ≡ *n*, and the changepoint vector **r** takes values in the support

$$\mathcal{S}(\mathbf{r}\mid Y_{\text{obs}})=\{\mathbf{r}: 1\le r_1<\cdots<r_s\le n,\ r_j\ \text{is an integer},\ j=1,\dots,s\}.$$

(2.4)

The primary objective is to make inferences about the unknown changepoints **r** and the unknown parameters (*θ*_{1},…,*θ*_{s+1}).

**3. Exact IBF sampling and marginal likelihood calculation**

For a given model, let *Y*_{obs} denote the observed data, *Z* the missing or latent data (e.g., changepoints), and *θ* the model-specific parameter vector. We further denote the likelihood function by *L*(*Y*_{obs}|*θ*) and the marginal density of *Y*_{obs} (equivalently, the marginal likelihood) by *m*(*Y*_{obs}). Within a Bayesian framework, we assume that the prior density is *π*(*θ*). Two basic tasks are (i) for the purpose of Bayesian inference, to obtain iid samples from the observed posterior *f*(*θ*|*Y*_{obs}) or, equivalently, from the joint posterior *f*(*Z*, *θ*|*Y*_{obs}); and (ii) for the purpose of Bayesian model choice, to calculate exactly the marginal likelihood *m*(*Y*_{obs}) of the given model.

In general, we can obtain explicit expressions for both the complete-data posterior distribution *f*(*θ*|*Y*_{obs}, *Z*) and the conditional predictive distribution *f*(*Z*|*Y*_{obs}, *θ*), that is, the sampling from the two distributions and the evaluation of the two densities can routinely be implemented. The fundamental conditional sampling principle implies

$$f(Z,\theta\mid Y_{\text{obs}})=f(Z\mid Y_{\text{obs}})\cdot f(\theta\mid Y_{\text{obs}},Z),$$

which states that if we could draw ${Z}^{(\ell)}\stackrel{\text{iid}}{\sim}f(Z\mid Y_{\text{obs}})$ and simulate $\theta^{(\ell)} \sim f(\theta\mid Y_{\text{obs}},Z^{(\ell)})$, then $\{(Z^{(\ell)},\theta^{(\ell)})\}_{1}^{L}$ are iid samples from the joint posterior *f*(*Z*, *θ*|*Y*_{obs}). Therefore, the key is to be able to generate iid samples from *f*(*Z*|*Y*_{obs}).

Let $\mathcal{S}(\theta\mid Y_{\text{obs}})$ and $\mathcal{S}(Z\mid Y_{\text{obs}})$ denote the conditional supports of *θ*|*Y*_{obs} and *Z*|*Y*_{obs}, respectively. The sampling IBF shows that (Tan *et al*., 2003)

$$f(Z\mid Y_{\text{obs}})\propto\frac{f(Z\mid Y_{\text{obs}},\theta_0)}{f(\theta_0\mid Y_{\text{obs}},Z)},\quad\text{for an arbitrary }\theta_0\in\mathcal{S}(\theta\mid Y_{\text{obs}})\text{ and all }Z\in\mathcal{S}(Z\mid Y_{\text{obs}}).$$

(3.1)

Consider the case where *Z* is a discrete random variable/vector taking finitely many values on the conditional support $\mathcal{S}(Z\mid Y_{\text{obs}})$. For example, in (2.2) the changepoint *r* takes values in {1,…, *n*}, and in (2.3) the *s* changepoints (*r*_{1},…, *r*_{s}) take values in $\mathcal{S}(\mathbf{r}\mid Y_{\text{obs}})$ given by (2.4). Without loss of generality, write

$$\mathcal{S}(Z\mid Y_{\text{obs}})=\mathcal{S}(Z\mid Y_{\text{obs}},\theta)=\{z_1,\dots,z_K\}.$$

Because of the discreteness of *Z*, the notation *f*(*z*_{k}|*Y*_{obs}, *θ*_{0}) denotes the conditional probability Pr{*Z* = *z*_{k}|*Y*_{obs}, *θ*_{0}}. For an arbitrary $\theta_0\in\mathcal{S}(\theta\mid Y_{\text{obs}})$, define

$$q_k=q_k(\theta_0)=\Pr\{Z=z_k\mid Y_{\text{obs}},\theta_0\}\big/f(\theta_0\mid Y_{\text{obs}},z_k),\quad k=1,\dots,K.$$

(3.2)

As both *f*(*Z*|*Y*_{obs}, *θ*) and *f*(*θ*|*Y*_{obs}, *Z*) are available, the computation of (3.2) is straightforward. Since all $\{q_k\}_{1}^{K}$ apparently depend on *θ*_{0}, we write *q*_{k} as *q*_{k}(*θ*_{0}) and normalize:

$$p_k=q_k(\theta_0)\Big/\sum_{k'=1}^{K}q_{k'}(\theta_0),\quad k=1,\dots,K,$$

(3.3)

where $\{p_k\}_{1}^{K}$ do not in fact depend on *θ*_{0}: by the sampling IBF (3.1), $q_k(\theta_0)\propto f(z_k\mid Y_{\text{obs}})$ for every *θ*_{0}, so the *θ*_{0}-dependent proportionality constant cancels in the normalization (3.3). Thus, it is easy to sample from *f*(*Z*|*Y*_{obs}), since it is a discrete distribution with probabilities $\{p_k\}_{1}^{K}$ on $\{z_k\}_{1}^{K}$ (e.g., the built-in S-plus function "`sample`" is designed for exactly this purpose). We summarize the algorithm as follows.

Given both the complete-data posterior distribution *f*(*θ*|*Y*_{obs}, *Z*) and the conditional predictive distribution *f*(*Z*|*Y*_{obs}, *θ*),

- Generate iid samples $\{Z^{(\ell)}\}_{\ell=1}^{L}$ of *Z* from the pmf *f*(*Z*|*Y*_{obs}) with probabilities $\{p_k\}_{1}^{K}$ on $\{z_k\}_{1}^{K}$;
- Generate $\theta^{(\ell)}\sim f(\theta\mid Y_{\text{obs}},Z^{(\ell)})$ for ℓ = 1,…,*L*; then $\{\theta^{(\ell)}\}_{1}^{L}$ are iid samples from the observed posterior distribution *f*(*θ*|*Y*_{obs}).
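To make the two steps concrete, here is a minimal Python sketch of the exact IBF sampling for a discrete latent variable. The callable-based interface and all function names are our own illustration, not notation from the paper:

```python
import numpy as np

def exact_ibf_sample(z_support, cond_pred_pmf, cpost_density, cpost_draw,
                     theta0, L, rng=None):
    """Exact IBF sampling for a discrete latent variable Z.

    z_support     -- the K support points z_1, ..., z_K
    cond_pred_pmf -- (z, theta) -> Pr{Z = z | Y_obs, theta}
    cpost_density -- (theta, z) -> f(theta | Y_obs, z)
    cpost_draw    -- (z, rng)   -> one draw from f(theta | Y_obs, z)
    theta0        -- any point in the support of theta | Y_obs
    """
    rng = np.random.default_rng() if rng is None else rng
    # Weights q_k of eq. (3.2); their normalization p_k, eq. (3.3),
    # does not depend on the choice of theta0.
    q = np.array([cond_pred_pmf(z, theta0) / cpost_density(theta0, z)
                  for z in z_support])
    p = q / q.sum()
    # Step 1: draw Z^(l) iid from f(Z | Y_obs), i.e. {p_k on z_k}.
    idx = rng.choice(len(z_support), size=L, p=p)
    # Step 2: draw theta^(l) from the complete-data posterior given Z^(l).
    draws = [(z_support[i], cpost_draw(z_support[i], rng)) for i in idx]
    return p, draws
```

The returned pairs are iid from the joint posterior *f*(*Z*, *θ*|*Y*_{obs}), so no burn-in or convergence diagnostics are needed.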

In this subsection, we provide two alternative formulae to calculate the marginal likelihood *m*(*Y*_{obs}). Let
${\{({Z}^{(\ell )},{\theta}^{(\ell )})\}}_{1}^{L}$ denote the output from the exact IBF sampling. From Bayes formula: *m*(*Y*_{obs}) = *L*(*Y*_{obs}|*θ*)*π*(*θ*)/*f*(*θ*|*Y*_{obs}), which holds for any *θ*, we have

$$\log m(Y_{\text{obs}})=\log L(Y_{\text{obs}}\mid\theta_0)+\log\pi(\theta_0)-\log f(\theta_0\mid Y_{\text{obs}}),\quad\theta_0\in\mathcal{S}(\theta\mid Y_{\text{obs}}).$$

(3.4)

For estimation efficiency, *θ*_{0} is generally taken to be a high-density point in the support of the posterior (e.g., the posterior mode or mean, as suggested by Chib (1995)). Since the observed posterior density can be written as

$$f(\theta\mid Y_{\text{obs}})=\int f(\theta\mid Y_{\text{obs}},Z)\,f(Z\mid Y_{\text{obs}})\,dZ,$$

we obtain a Monte Carlo estimate of *f*(*θ*|*Y*_{obs}) at *θ*_{0},

$$\widehat{f}(\theta_0\mid Y_{\text{obs}})=(1/L)\sum_{\ell=1}^{L}f(\theta_0\mid Y_{\text{obs}},Z^{(\ell)}),$$

(3.5)

where $\{Z^{(\ell)}\}$ are iid samples from *f*(*Z*|*Y*_{obs}). Note that this estimate is simulation-consistent, i.e., $\widehat{f}(\theta_0\mid Y_{\text{obs}})\to f(\theta_0\mid Y_{\text{obs}})$ as *L* → ∞. Combining (3.4) with (3.5), we have an approximate formula to calculate *m*(*Y*_{obs}).

On the other hand, since *Z* is a discrete random variable taking values on $\{z_k\}_{1}^{K}$, the point-wise IBF (Tan *et al*., 2003), *f*(*θ*|*Y*_{obs}) = {∫ *f*(*Z*|*Y*_{obs}, *θ*)/*f*(*θ*|*Y*_{obs}, *Z*) *dZ*}^{−1}, explicitly gives

$$f(\theta_0\mid Y_{\text{obs}})=\Big\{\sum_{k'=1}^{K}q_{k'}(\theta_0)\Big\}^{-1}=p_1/q_1(\theta_0),$$

(3.6)

where *p*_{k} and *q*_{k}(*θ*_{0}) are defined in (3.3) and (3.2), respectively. Combining (3.4) with (3.6) gives an exact formula for *m*(*Y*_{obs}).
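A small sketch of this exact calculation, combining (3.4) with the point-wise IBF (3.6); the inputs are assumed precomputed, and the helper name is ours:

```python
import numpy as np

def log_marginal(log_lik_at_theta0, log_prior_at_theta0, q):
    """Exact log m(Y_obs) from eqs. (3.4) and (3.6).

    q -- array of q_k(theta_0), k = 1, ..., K, computed as in eq. (3.2)
    """
    log_post_at_theta0 = -np.log(q.sum())  # point-wise IBF, eq. (3.6)
    # Bayes formula on the log scale, eq. (3.4)
    return log_lik_at_theta0 + log_prior_at_theta0 - log_post_at_theta0
```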

**4. Poisson models with changepoints**

Let *M*_{s} represent a Poisson model with *s* changepoints. We first consider the single-changepoint model *M*_{1}, in which $y_1,\dots,y_r\stackrel{\text{iid}}{\sim}\text{Poisson}(\theta_1)$ and $y_{r+1},\dots,y_n\stackrel{\text{iid}}{\sim}\text{Poisson}(\theta_2)$ with unknown changepoint *r*. We place a discrete uniform prior on *r* over {1,…,*n*} and independent gamma priors on the rates:

$$\theta_1\sim\text{Ga}(a_1,b_1)\quad\text{and}\quad\theta_2\sim\text{Ga}(a_2,b_2),$$

(4.1)

where Ga(*a*, *b*) is the gamma distribution with density $\text{Ga}(x\mid a,b)=b^a x^{a-1}e^{-bx}/\Gamma(a)$, *x* > 0. The joint posterior distribution (2.2) then becomes

$$f(r,\theta_1,\theta_2\mid Y_{\text{obs}})\propto\theta_1^{a_1+S_r-1}e^{-(b_1+r)\theta_1}\cdot\theta_2^{a_2+S_n-S_r-1}e^{-(b_2+n-r)\theta_2},$$

where $S_r\equiv\sum_{i=1}^{r}y_i$. Direct calculation yields

$$f(\theta_1,\theta_2\mid Y_{\text{obs}},r)=\text{Ga}(\theta_1\mid a_1+S_r,\ b_1+r)\cdot\text{Ga}(\theta_2\mid a_2+S_n-S_r,\ b_2+n-r),$$

(4.2)

$$f(r\mid Y_{\text{obs}},\theta_1,\theta_2)=\frac{(\theta_1/\theta_2)^{S_r}\exp\{(\theta_2-\theta_1)r\}}{\sum_{i=1}^{n}(\theta_1/\theta_2)^{S_i}\exp\{(\theta_2-\theta_1)i\}},\quad r=1,\dots,n.$$

(4.3)

We can treat the changepoint *r* as the latent variable *Z* and (*θ*_{1}, *θ*_{2}) as the parameter vector *θ*. By using (3.1)–(3.3), for any given $(\theta_{1,0},\theta_{2,0})\in\mathcal{S}(\theta_1,\theta_2\mid Y_{\text{obs}})$, we immediately obtain

$$f(r\mid Y_{\text{obs}})=\frac{\Gamma(a_1+S_r)\,\Gamma(a_2+S_n-S_r)\big/\big[(b_1+r)^{a_1+S_r}(b_2+n-r)^{a_2+S_n-S_r}\big]}{\sum_{i=1}^{n}\Gamma(a_1+S_i)\,\Gamma(a_2+S_n-S_i)\big/\big[(b_1+i)^{a_1+S_i}(b_2+n-i)^{a_2+S_n-S_i}\big]},$$

(4.4)

where *r* = 1,…,*n*. This again confirms that the right-hand side of (4.4) does not depend on (*θ*_{1,0}, *θ*_{2,0}). Based on (4.4) and (4.2), we can obtain iid posterior samples of the changepoint *r* and the parameters (*θ*_{1}, *θ*_{2}) by the exact IBF sampling.
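As an illustration, a minimal Python sketch of this single-changepoint sampler, evaluating (4.4) on the log scale via `gammaln` for numerical stability and then drawing from the gamma conditionals (4.2). The function name and the default hyperparameters (the standard exponential priors *a* = *b* = 1 used later in Section 5) are our choices:

```python
import numpy as np
from scipy.special import gammaln

def poisson_single_cp(y, a1=1.0, b1=1.0, a2=1.0, b2=1.0, L=20000, rng=None):
    """Exact IBF sampling for the single-changepoint Poisson model.

    Returns the exact pmf f(r | Y_obs) of eq. (4.4) together with L iid
    draws of (r, theta1, theta2) from eqs. (4.4) and (4.2).
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y)
    n = len(y)
    S = np.cumsum(y)            # S_r = y_1 + ... + y_r for r = 1, ..., n
    Sn = S[-1]
    r = np.arange(1, n + 1)
    # log numerator of eq. (4.4) for every candidate r
    logw = (gammaln(a1 + S) + gammaln(a2 + Sn - S)
            - (a1 + S) * np.log(b1 + r)
            - (a2 + Sn - S) * np.log(b2 + n - r))
    p = np.exp(logw - logw.max())
    p /= p.sum()                # exact pmf f(r | Y_obs)
    r_draws = rng.choice(r, size=L, p=p)
    S_r = S[r_draws - 1]
    # eq. (4.2); numpy's gamma uses shape and scale = 1/rate
    theta1 = rng.gamma(a1 + S_r, 1.0 / (b1 + r_draws))
    theta2 = rng.gamma(a2 + Sn - S_r, 1.0 / (b2 + n - r_draws))
    return p, r_draws, theta1, theta2
```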

Now we consider the multiple-changepoint model *M*_{s}. In (2.3), let *f*_{j}(·|*θ*_{j}) be the Poisson(*θ*_{j}) density; we place a uniform prior on **r** over $\mathcal{S}(\mathbf{r}\mid Y_{\text{obs}})$ and assume independent gamma priors

$$\theta_j\sim\text{Ga}(a_j,b_j),\quad j=1,\dots,s+1.$$

(4.5)

Thus, the joint posterior distribution (2.3) becomes

$$f(\mathbf{r},\theta\mid Y_{\text{obs}})\propto\prod_{j=1}^{s+1}\theta_j^{a_j+S_{r_j}-S_{r_{j-1}}-1}e^{-(b_j+r_j-r_{j-1})\theta_j},$$

(4.6)

where $S_r\equiv\sum_{i=1}^{r}y_i$, *r*_{0} ≡ 0 and *r*_{s+1} ≡ *n*. From (4.6), we have

$$f(\theta\mid Y_{\text{obs}},\mathbf{r})=\prod_{j=1}^{s+1}\text{Ga}(\theta_j\mid a_j+S_{r_j}-S_{r_{j-1}},\ b_j+r_j-r_{j-1}),$$

(4.7)

$$f(\mathbf{r}\mid Y_{\text{obs}},\theta)\propto\prod_{j=1}^{s}(\theta_j/\theta_{j+1})^{S_{r_j}}e^{(\theta_{j+1}-\theta_j)r_j},\quad\mathbf{r}\in\mathcal{S}(\mathbf{r}\mid Y_{\text{obs}}).$$

(4.8)

We treat **r** as the latent vector and *θ* = (*θ*_{1},…,*θ*_{s+1}) as the parameter vector. By using (3.1)–(3.3), we obtain

$$f(\mathbf{r}\mid Y_{\text{obs}})\propto\prod_{j=1}^{s+1}\frac{\Gamma(a_j+S_{r_j}-S_{r_{j-1}})}{(b_j+r_j-r_{j-1})^{a_j+S_{r_j}-S_{r_{j-1}}}},\quad\mathbf{r}\in\mathcal{S}(\mathbf{r}\mid Y_{\text{obs}}).$$

(4.9)

Based on (4.9) and (4.7), we can obtain iid posterior samples of the changepoints **r** and the parameter vector *θ* = (*θ*_{1},…,*θ*_{s+1}) by the exact IBF sampling.
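Assuming the uniform prior on **r** implicit in (4.9), a minimal sketch that enumerates the support (2.4) and evaluates (4.9) on the log scale (the helper name is ours):

```python
import numpy as np
from itertools import combinations
from scipy.special import gammaln

def poisson_multi_cp_pmf(y, s, a=None, b=None):
    """Exact pmf f(r | Y_obs) of eq. (4.9) for the s-changepoint Poisson
    model, enumerated over the support (2.4): 1 <= r_1 < ... < r_s <= n.
    """
    y = np.asarray(y)
    n = len(y)
    a = np.ones(s + 1) if a is None else np.asarray(a, dtype=float)
    b = np.ones(s + 1) if b is None else np.asarray(b, dtype=float)
    S = np.concatenate(([0.0], np.cumsum(y)))      # S[r] = S_r, S_0 = 0
    grid = list(combinations(range(1, n + 1), s))  # the support (2.4)
    logw = np.empty(len(grid))
    for m, r in enumerate(grid):
        rr = np.array((0,) + r + (n,))   # prepend r_0 = 0, append r_{s+1} = n
        dS = S[rr[1:]] - S[rr[:-1]]      # segment sums S_{r_j} - S_{r_{j-1}}
        dn = rr[1:] - rr[:-1]            # segment lengths r_j - r_{j-1}
        logw[m] = np.sum(gammaln(a + dS) - (a + dS) * np.log(b + dn))
    p = np.exp(logw - logw.max())
    return grid, p / p.sum()
```

Sampling **r** then reduces to a draw from this discrete pmf, followed by the gamma conditionals (4.7). Note that the enumeration has $\binom{n}{s}$ terms, which is cheap for small *s* (e.g., 4950 terms for *n* = 100, *s* = 2).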

In practice, we are generally uncertain about the number of changepoints. Hence, model determination is the first task in changepoint analysis. Let *M*_{s} represent the Poisson model with *s* changepoints, and let Θ denote its model-specific parameter vector. Evaluating (3.4) at a high-density point $\widehat{\Theta}$ (e.g., the posterior mode or mean) gives

$$\log m(Y_{\text{obs}}\mid M_s)=\log L(Y_{\text{obs}}\mid\widehat{\Theta},M_s)+\log\pi(\widehat{\Theta}\mid M_s)-\log f(\widehat{\Theta}\mid Y_{\text{obs}},M_s),$$

(4.10)

where $f(\widehat{\Theta}\mid Y_{\text{obs}},M_s)$ can be evaluated exactly by the point-wise IBF (3.6) or estimated by (3.5). Candidate models can then be compared through the Bayes factor of *M*_{s} versus *M*_{s+1}:

$${B}_{s,s+1}=\frac{m({Y}_{\text{obs}}\mid {M}_{s})}{m({Y}_{\text{obs}}\mid {M}_{s+1})}.$$

(4.11)

Jeffreys (1961, Appendix B) suggested a graded scale for interpreting *B*_{s,s+1}: values above 1 favour *M*_{s}, with values above about 10 constituting strong evidence, while values below 1 correspondingly favour *M*_{s+1}; see Kass and Raftery (1995) for a detailed review.
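For the Poisson models of this section, the marginal likelihood can also be obtained by integrating the gamma priors analytically and summing over the discrete support, which in exact arithmetic agrees with evaluating (4.10) via (3.6). The sketch below (function name and the uniform prior on **r** are our assumptions, consistent with the posteriors above) reuses the enumeration from the previous sketch:

```python
import numpy as np
from itertools import combinations
from scipy.special import gammaln, logsumexp

def log_marg_poisson(y, s, a=None, b=None):
    """log m(Y_obs | M_s) for the Poisson model with s changepoints
    (s = 0 gives the no-change model M_0), assuming gamma priors (4.5)
    and a uniform prior on r over the support (2.4)."""
    y = np.asarray(y)
    n = len(y)
    a = np.ones(s + 1) if a is None else np.asarray(a, dtype=float)
    b = np.ones(s + 1) if b is None else np.asarray(b, dtype=float)
    S = np.concatenate(([0.0], np.cumsum(y)))
    grid = list(combinations(range(1, n + 1), s))  # single entry when s = 0
    logw = np.empty(len(grid))
    for m, r in enumerate(grid):
        rr = np.array((0,) + r + (n,))
        dS = S[rr[1:]] - S[rr[:-1]]
        dn = rr[1:] - rr[:-1]
        logw[m] = np.sum(gammaln(a + dS) - (a + dS) * np.log(b + dn))
    # constants: 1/prod(y_i!) from the Poisson likelihood and the
    # normalizing constants b_j^{a_j} / Gamma(a_j) of the gamma priors
    const = -gammaln(y + 1).sum() + np.sum(a * np.log(b) - gammaln(a))
    return const + logsumexp(logw) - np.log(len(grid))
```

Bayes factors (4.11) then follow as `B = exp(log_marg_poisson(y, s) - log_marg_poisson(y, s + 1))`.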

**5. Analysis of the HUS data**

We first analyze the Birmingham data in Table 1. Denote the number of cases of HUS in Birmingham in year *i* by *y*_{i} (*i* = 1,…,*n*, with *n* = 20 corresponding to the years 1970–1989).

Under *M*_{1}, we assume that $y_1,\dots,y_r\stackrel{\text{iid}}{\sim}\text{Poisson}(\theta_1)$ and $y_{r+1},\dots,y_n\stackrel{\text{iid}}{\sim}\text{Poisson}(\theta_2)$, where *r* is the unknown changepoint and *θ*_{1} ≠ *θ*_{2}. Table 2 contains the exact posterior probabilities of the changepoint *r* computed from (4.4). The changepoint occurs at *r* = 11 (i.e., the year 1980) with posterior probability 0.9795. Based on (4.4) and (4.2), we generate 20,000 iid posterior samples by the exact IBF sampling; the Bayes estimates of *r*, *θ*_{1} and *θ*_{2} are 11.013, 1.593 and 9.609, with Bayes standard errors 0.143, 0.370 and 0.985. The 95% Bayes credible intervals for *θ*_{1} and *θ*_{2} are [0.952, 2.393] and [7.800, 11.621], respectively. Figures 2(a) and 2(b) show the histogram of the changepoint *r* and the posterior densities of *θ*_{1} and *θ*_{2}, estimated by a kernel density smoother from the iid posterior samples. Figure 2(c) depicts the annual numbers of HUS cases for the Birmingham series, the identified changepoint position, and the average number of cases before and after the changepoint.

Figure 2. Birmingham data set. (a) Histogram of the changepoint *r*. (b) The posterior densities of *θ*_{1} and *θ*_{2} estimated by a kernel density smoother based on 20,000 iid samples generated via the exact IBF sampling. (c) The annual numbers of cases **...**

Now we analyze the Newcastle data in Table 1. Similarly, the three log-marginal likelihoods are log*m*(*Y*_{obs}|*M*_{0}) = −85.24, log*m*(*Y*_{obs}|*M*_{1}) = −64.13 and log*m*(*Y*_{obs}|*M*_{2}) = −64.10. From (4.11), the Bayes factor for *M*_{2} versus *M*_{0} is 1.5169 × 10^{9}, and the Bayes factor for *M*_{2} versus *M*_{1} is 1.03. We therefore select *M*_{2}, which is consistent with the pattern indicated in Figure 1 and agrees with the selection obtained by Henderson and Matthews (1993).

Under *M*_{2}, we assume that
${y}_{1},\dots ,{y}_{{r}_{1}}\stackrel{\text{iid}}{\sim}\text{Poisson}({\theta}_{1}),{y}_{{r}_{1}+1},\dots ,{y}_{{r}_{2}}\stackrel{\text{iid}}{\sim}\text{Poisson}({\theta}_{2})$, and
${y}_{{r}_{2}+1},\dots ,{y}_{n}\stackrel{\text{iid}}{\sim}\text{Poisson}({\theta}_{3})$, where (*r*_{1}, *r*_{2}) are the unknown changepoints and *θ*_{1} ≠ *θ*_{2} ≠ *θ*_{3}. We use the standard exponential prior distributions, specified by letting *a*_{j} = *b*_{j} = 1 in (4.5).

**6. Simulation Studies**

The first simulated dataset consists of 100 observations with
${y}_{1},\dots ,{y}_{50}\stackrel{\text{iid}}{\sim}\text{Poisson}(3)$ and
${y}_{51},\dots ,{y}_{100}\stackrel{\text{iid}}{\sim}\text{Poisson}(0.5)$. The simulated observations are shown in Figure 4(c). We again use standard exponential distributions as the priors of *θ*. From (4.10), the log-marginal likelihoods of the three models *M*_{0}, *M*_{1} and *M*_{2} are log*m*(*Y*_{obs}|*M*_{0}) = −187.1, log*m*(*Y*_{obs}|*M*_{1}) = −148.8 and log*m*(*Y*_{obs}|*M*_{2}) = −149.3. From (4.11), we have *B*_{10} = 4.3 × 10^{16} and *B*_{12} = 1.649, which suggests that *M*_{1} is appropriate. Computation of (4.4) shows that the changepoint occurs at *r* = 50 with posterior probability 0.807. Based on (4.4) and (4.2), we generate 30,000 iid posterior samples by the exact IBF sampling. The Bayes means, standard errors, and 95% credible intervals of (*r*, *θ*_{1}, *θ*_{2}) are (50.6, 2.8226, 0.5249), (1.642, 0.240, 0.103), and [50, 56], [2.373, 3.315], [0.341, 0.742], respectively. Figures 4(a) and 4(b) show the histogram of *r* and the posterior densities of *θ*_{1} and *θ*_{2}. Figure 4(c) displays the 100 simulated observations, the identified changepoint position, and the Bayes estimates of *θ*_{1} and *θ*_{2}.
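For reproducibility, a sketch of how a dataset like this one can be regenerated and analyzed with the single-changepoint helper sketched in Section 4 (`poisson_single_cp`, our hypothetical function, assumed in scope; the seed is arbitrary, so exact numbers will differ from those reported above):

```python
import numpy as np

# 50 draws from Poisson(3) followed by 50 draws from Poisson(0.5)
rng = np.random.default_rng(2009)
y = np.concatenate([rng.poisson(3.0, 50), rng.poisson(0.5, 50)])

# Analyze under M_1 via the exact IBF sampling
p, r_draws, th1, th2 = poisson_single_cp(y, L=30000, rng=rng)
print(np.argmax(p) + 1)          # posterior mode of r (true changepoint: 50)
print(th1.mean(), th2.mean())    # Bayes estimates of theta1, theta2
```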

Figure 4. Simulated dataset with one changepoint. (a) Histogram of *r*. (b) The posterior densities of *θ*_{1} and *θ*_{2}. (c) The 100 simulated observations. The dotted vertical line denotes the identified changepoint position (*r* = 50), the left horizontal **...**

The second simulated dataset consists of 100 observations:
${y}_{1},\dots ,{y}_{20}\stackrel{\text{iid}}{\sim}\text{Poisson}(5.5),{y}_{21},\dots ,{y}_{70}\stackrel{\text{iid}}{\sim}\text{Poisson}(0.8)$, and
${y}_{71},\dots ,{y}_{100}\stackrel{\text{iid}}{\sim}\text{Poisson}(3.5)$. The simulated observations are shown in Figure 5(d). Similarly, we have log*m*(*Y*_{obs}|*M*_{0}) = −249.7, log*m*(*Y*_{obs}|*M*_{1}) = −222.2 and log*m*(*Y*_{obs}|*M*_{2}) = −185.6, so that *B*_{20} = 6.891 × 10^{27} and *B*_{21} = 7.856 × 10^{15}, which suggests that *M*_{2} is appropriate. Computation of (4.9) shows that the changepoints occur at *r*_{1} = 20 and *r*_{2} = 70 with joint posterior probability 0.7912. Based on (4.9) and (4.7), we generate 30,000 iid posterior samples by the exact IBF sampling. The Bayes means, standard errors, and 95% credible intervals of (*r*_{1}, *r*_{2}, *θ*_{1}, *θ*_{2}, *θ*_{3}) are (20.0091, 69.7871, 5.7120, 0.8427, 3.8190), (0.1019, 0.4457, 0.5230, 0.1296, 0.3517), and [20, 20], [69, 70], [4.735, 6.789], [0.610, 1.116], [3.161, 4.537], respectively. Figures 5(a) and 5(b) show the histograms of *r*_{1} and *r*_{2}, Figure 5(c) shows the posterior densities of *θ*_{1}, *θ*_{2} and *θ*_{3}, and Figure 5(d) displays the 100 simulated observations, the two identified changepoint positions, and the Bayes estimates of *θ*_{1}, *θ*_{2} and *θ*_{3}.

**7. Discussion**

It is noted that Barry and Hartigan (1992, 1993) and Fearnhead (2006) describe methods for calculating marginal likelihoods for multiple change-point problems; the latter also discusses methods for simulating from the posterior of the change-point positions. Barry and Hartigan (1992, 1993) assumed a specific prior structure on the number and positions of changepoints, while Fearnhead (2006) considered models with a fixed number of change-points. Although these methods were claimed to handle arbitrarily large numbers of change-points, they are quite complicated to implement; for example, the parameter values may be estimated exactly in *O*(*n*^{3}) calculations, or to an adequate approximation by MCMC methods that are *O*(*n*) in the number of observations.

In this paper, we considered Poisson changepoint analysis by using an exact IBF sampling approach. The advantages of the proposed exact IBF sampling over MCMC methods are that (i) there is no need to diagnose whether the algorithm has converged, i.e., it entirely avoids the convergence and slow-convergence problems associated with MCMC methods; and (ii) because the samples generated from the observed posterior distribution are independent, it is straightforward to quantify the uncertainty in estimates of features of the posterior distributions based on them. To determine the number of changepoints, we developed simple methods that calculate the marginal likelihood (or Bayes factor) exactly. Two simulations were conducted to validate the performance of the proposed methods.

We should point out that the proposed approach has limitations. For example, let *M*_{s} represent a model with *s* changepoints: the support (2.4) then contains $\binom{n}{s}$ candidate configurations, so the enumeration underlying the exact IBF sampling and the exact marginal-likelihood calculation grows combinatorially and becomes computationally prohibitive when both *n* and *s* are large.

In the re-analysis of the HUS data using the proposed methods, we have focused on the annual numbers of cases rather than the incidence of HUS, because accurate population figures are difficult to obtain for the catchment areas. Possibilities for further analysis include extra-Poisson variation, trends in the mean, the influence of covariates (e.g., age, race), and so on.

We are grateful to the Editor, an Associate Editor and three referees for their constructive comments and suggestions. GL Tian and M Tan’s research was supported in part by U.S. National Cancer Institute grants CA106767 and CA119758. The research of KW Ng was partially supported by a research grant of the University of Hong Kong. Special thanks should go to one referee for drawing our attention to several recent papers on multiple change-point problems.


**References**

- Arnold SF. Gibbs sampler. In: Rao CR, editor. Handbook of Statistics. Vol. 9. Elsevier Science Publishers B.V.; 1993. pp. 599–625.
- Barry D, Hartigan JA. Product partition models for change point problems. Ann Statist. 1992;20:260–279.
- Barry D, Hartigan JA. A Bayesian analysis for change point problems. Journal of the American Statistical Association. 1993;88:309–319.
- Brodsky BE, Darkhovsky BS. Nonparametric Methods in Change Point Problems. Series: Mathematics and Its Applications. Vol. 243. Springer; New York: 1993.
- Carlin BP, Gelfand AE, Smith AFM. Hierarchical Bayesian analysis of changepoint problems. Appl Statist. 1992;41:389–405.
- Chen MH. Computing marginal likelihoods from a single MCMC output. Statistica Neerlandica. 2005;59:16–29.
- Chen J, Gupta AK. Parametric Statistical Change Point Analysis. Springer; New York: 2000.
- Chib S. Marginal likelihood from the Gibbs output. Journal of the American Statistical Association. 1995;90:1313–1321.
- Chib S. Estimation and comparison of multiple change-point models. Journal of Econometrics. 1998;86:221–241.
- Evans M, Swartz T. Methods for approximating integrals in statistics with special emphasis on Bayesian integration problems (with discussions) Statist Sci. 1995;10:254–272.
- Evans M, Swartz T. Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford University Press; Oxford: 2000.
- Fearnhead P. Exact and efficient inference for multiple changepoint problems. Statist Comput. 2006;16:203–213.
- Fearnhead P, Liu Z. On-line inference for multiple changepoint problems. Journal of the Royal Statistical Society, Series B. 2007;64:589–605.
- Gelfand AE, Dey DK. Bayesian model choice: asymptotics and exact calculations. Journal of the Royal Statistical Society, Series B. 1994;56:501–514.
- Halpern AL. Minimally selected *p* and other tests for a single abrupt changepoint in a binary sequence. Biometrics. 1999;55:1044–1050.
- Henderson R, Matthews JNS. An investigation of changepoints in the annual number of cases of haemolytic uraemic syndrome. Appl Statist. 1993;42:461–471.
- Hobert JP, Casella G. The effect of improper priors on Gibbs sampling in hierarchical linear models. J Am Statist Assoc. 1996;91:1461–1473.
- Jarrett RG. A note on the intervals between coal-mining disasters. Biometrika. 1979;66:191–193.
- Jeffreys H. Theory of Probability. 3rd ed. Oxford: Oxford University Press; 1961.
- Kass RE, Raftery AE. Bayes factors. Journal of the American Statistical Association. 1995;90:773–795.
- Maguire BA, Pearson ES, Wynn AHA. The time intervals between industrial accidents. Biometrika. 1952;38:168–180.
- Milford DV, Taylor CM, Gutteridge B, Hall SM, Rowe B, Kleanthous H. Haemolytic uraemic syndromes in the British Isles 1985–8: association with verocytotoxin producing Escherichia coli. Part 1: Clinical and epidemiological aspects. Arch Dis Child. 1990;65(7):716–721.
- Raftery AE, Akman VE. Bayesian analysis of a Poisson process with a change-point. Biometrika. 1986;73:85–89.
- Siegmund D. Confidence sets in change point problems. Int Statist Rev. 1988;56:31–48.
- Smith AFM. A Bayesian approach to inference about a change-point in a sequence of random variables. Biometrika. 1975;62:407–416.
- Smith AFM. Change-point problems: Approaches and applications. In: Bernardo JM, DeGroot MH, Lindley DV, Smith AFM, editors. Bayesian Statistics. Vol. 1. Valencia: Valencia University Press; 1980. pp. 83–89.
- Smith AFM, Cook DG. Straight lines with a change-point: A Bayesian analysis of some renal transplant data. Appl Statist. 1980;29:180–189.
- Stephens DA. Bayesian retrospective multiple-changepoint identification. Appl Statist. 1994;43:159–178.
- Tan M, Tian GL, Ng KW. A noniterative sampling method for computing posteriors in the structure of EM-type algorithms. Statistica Sinica. 2003;13:625–639.
- Tarr PI, Neill MA, Allen J, Siccardi CJ, Watkins SL, Hickman RO. The increasing incidence of the hemolytic-uremic syndrome in King County, Washington: lack of evidence for ascertainment bias. Am J Epidemiol. 1989;129(3):582–586.
- Worsley KJ. Confidence regions and tests for a change-point in a sequence of exponential family random variables. Biometrika. 1986;73:91–104.
- Wu YH. Inference for Change Point and Post Change Means After a CUSUM Test. Series: Lecture Notes in Statistics. Vol. 180. Springer; New York: 2005.
