Statistical process control (SPC) is an approach to quality improvement that has seen increasing use in healthcare since the early 1990s [1]. Originated by Shewhart [2], SPC provides analytical tools for understanding the variation displayed by measures of quality, together with an approach to acting on the resulting information with a view to making improvements. The control charts that form the mainstay of SPC analysis provide a simple graphical approach to understanding variation. Following Shewhart’s initial work, SPC was developed further by Shewhart and Deming [3], and a substantial literature now exists, including on the application of SPC in healthcare [4].
The most natural application of SPC in healthcare is to time series data: the natural ordering of the data in time is central to the correct application of the analysis. However, the SPC literature contains conflicting opinions on the use of control charts for data that do not come with a natural ordering. Some authors recommend the use of control charts for such data [4], and some have applied this analysis, for example in comparing hazard ratios for specific mortality rates [8]. Other authors argue against the use of control charts in such situations [9]. In this article we resolve objectively and quantitatively the question of whether it is acceptable to use control charts to analyse variation in data that do not have a natural ordering.
As all measured data exhibit variation, the idea behind a control chart is to provide concrete rules for assessing the likely nature of the observed variation. Broadly speaking, the observed variation is classified as either “common cause” variation or “special cause” variation. Common cause variation is the variation exhibited by a process in its usual state, whereas special cause variation is caused by an exceptional or external event. The rules are couched in terms of a number of horizontal lines, the process (or control) limits, marked on a line graph of the data. How these limits are calculated depends on the type of chart used, and several are available. The most commonly used are the individual values and moving range (XmR) charts; p-charts (used to monitor the proportion of faults in a sample); np-charts (an adaptation of the p-chart used to interpret performance in numbers of units rather than proportions); c-charts (used to monitor count data, that is, the number of faults per unit or the total number of events occurring over a certain unit of time); and u-charts (used to monitor count data where the sample size is greater than one, that is, the average number of faults per unit). The XmR chart is one of the simplest charts to construct, and yet also one of the more robust in general practice, as the other charts rely on the data conforming to an assumed distribution: p- and np-charts rely on the binomial distribution, whilst c- and u-charts rely on the Poisson distribution. The XmR chart makes no such assumption and instead uses the data themselves to provide empirical limits through calculation of an average moving range. By contrast, the p- and np-charts, for example, assume the variation to be a function of the location and plot theoretical limits that will not hold if the binomial assumption is violated. Technical details of the different types of control chart and the relevant assumptions are widely available, e.g.
In healthcare, more so than in manufacturing, the birthplace of SPC, we will seldom be in a position to justify stringent assumptions, such as those of the binomial model. The XmR chart, the simplest control chart, also has the distinct advantage of carrying the least stringent assumptions. In fact, the only assumption required is that a rational sampling and sub-grouping regime is used [11]. Here, rational means taking into account the context of the data, the sources of variation, and the questions to be addressed by the charts. Thus, in the complex real world of healthcare, the robustness of the XmR chart to the distribution of the data is invaluable. Furthermore, even if the assumptions of a specific model do hold, in most cases the XmR chart will yield identical results to the more restrictive chart [4]. With this in mind, we focus on the XmR chart for the rest of this article.
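The contrast between theoretical and empirical limits can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `p_chart_limits` and `xmr_limits` and the counts are hypothetical, the sample size is assumed constant, and the XmR limits use the conventional scaling factor 2.66 (that is, 3/1.128) applied to the average moving range.

```python
from math import sqrt
from statistics import mean


def p_chart_limits(counts, n):
    """Theoretical 3-sigma limits for a p-chart with constant sample size n.

    Assumes each count is binomial(n, p), so the limits depend only on
    the overall proportion p_bar and on n.
    """
    p_bar = sum(counts) / (n * len(counts))
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    # Proportions cannot fall outside [0, 1], so clip the limits there.
    return p_bar, max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)


def xmr_limits(values):
    """Empirical limits for an X chart, built from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    half_width = 2.66 * mean(moving_ranges)  # 2.66 = 3 / d2, with d2 = 1.128
    return centre, centre - half_width, centre + half_width


# Hypothetical data: events observed in ten successive samples of 100 cases.
counts = [12, 9, 11, 10, 8, 13, 10, 9, 11, 7]
n = 100
proportions = [c / n for c in counts]

print("p-chart (centre, lower, upper):", p_chart_limits(counts, n))
print("XmR     (centre, lower, upper):", xmr_limits(proportions))
```

Both charts share the same centre line here. The p-chart limits follow from the overall proportion and the sample size alone, whereas the XmR limits are driven by the point-to-point variation actually observed in the data.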
For the XmR analysis of data with a natural ordering, it is important that global measures of dispersion, such as the overall standard deviation, are not used in the calculation of control limits [12]. This is because such a global measure only makes sense under the assumption that the data are homogeneous, whilst the primary question that the control chart is designed to answer is precisely this: are the data homogeneous, or do they contain signals of heterogeneity, that is, “special causes”? Instead, the correct method for calculating the control limits for an XmR chart is via the average moving range [2]. This subtle distinction is of fundamental importance in the correct application of the methodology of SPC.
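The distinction between the two calculations can be illustrated with a short Python sketch. It is an assumption-laden example rather than the authors' method: the function names and the data (a series with a step change halfway through) are hypothetical, and the moving-range limits use the conventional factor 2.66 (3/1.128).

```python
from statistics import mean, stdev


def xmr_limits(values):
    """Limits for the X chart: mean +/- 2.66 * average moving range."""
    if len(values) < 2:
        raise ValueError("at least two values are needed for a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return centre - half_width, centre + half_width


def global_sd_limits(values):
    """Limits from the overall standard deviation (the approach to avoid)."""
    centre = mean(values)
    half_width = 3 * stdev(values)
    return centre - half_width, centre + half_width


def signals(values, limits):
    """Return the values that fall outside the given (lower, upper) limits."""
    lower, upper = limits
    return [v for v in values if v < lower or v > upper]


# Hypothetical series with a step change after the eighth point.
data = [10, 11, 9, 10, 12, 10, 11, 9, 20, 21, 19, 20, 22, 20, 21, 19]

print("moving-range limits:", xmr_limits(data), signals(data, xmr_limits(data)))
print("global-SD limits:   ", global_sd_limits(data), signals(data, global_sd_limits(data)))
```

In this example the step change inflates the overall standard deviation, so the global limits are wide enough to contain every point, while the limits based on the average moving range are narrower and flag points on both sides of the shift.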
Whilst the XmR chart was originally formulated with time-series data in mind, its use has been advocated for data in which there is logical comparability but no inherent ordering, provided the order in which the data are placed is not determined by the data themselves [4]. In this article we explore quantitatively the consequences of the lack of natural ordering for the average moving range, both theoretically and via an example using real-world data. We then discuss alternative approaches to the detection of special causes for data without a natural order.
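Why the ordering matters for the average moving range can be seen in a small Python sketch. This is only an illustration of the dependence, not the analysis carried out in the article: the function name `average_moving_range` and the six values are hypothetical.

```python
from itertools import permutations
from statistics import mean


def average_moving_range(values):
    """Mean absolute difference between consecutive values, in the order given."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(moving_ranges)


# Hypothetical values with no natural ordering.
values = [1, 2, 3, 4, 5, 6]

# The same values in two different arbitrary orders.
sorted_order = average_moving_range(sorted(values))
alternating_order = average_moving_range([1, 6, 2, 5, 3, 4])

# Every possible ordering of the six values (720 permutations).
all_mr_bars = [average_moving_range(p) for p in permutations(values)]

print("sorted order:     ", sorted_order)
print("alternating order:", alternating_order)
print("range over all orderings:", min(all_mr_bars), "to", max(all_mr_bars))
```

Because the control limits are a fixed multiple of the average moving range, the same set of values can produce limits of quite different widths depending on the order in which they happen to be placed.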