Funnel plots are now commonly used to identify health care providers with potentially outlying performance. For the SMR, funnel plots have the convenience that the SMR from an individual provider can be plotted on a graph on which the control limits have been pre-drawn. Their interpretation is, at first sight, also straightforward: the observed SMR for a provider whose underlying performance matches the ‘target’ will fall outside the control limits with a known (nominal) probability. However, as has been shown in this paper, the true probability of falling outside the limits does not always match this nominal value. Two reasons for this mismatch were investigated here: 1) the use of different methods to construct the control limits; 2) the effect of discrete outcomes in preventing the specification of exact probabilities.
Three commonly used methods based on the Poisson distribution have been investigated for 95% and 99.8% control limits for funnel plots of the SMR. Two of these methods were based on confidence intervals and the third was the prediction interval derived using the Poisson cumulative probability distribution. The methods produced different control limits and different probabilities for an ‘in control’ unit to fall outside of these limits. The probability of a provider being identified as a potential outlier is dependent, therefore, on the method used to calculate the control limits.
Whilst no one method performs well for all values of λ (the expected number of events), the ‘exact’ confidence interval method performed particularly poorly and should be avoided if a probability close to the nominal value is desired. The probability of the observed outcome from an ‘in control’ institution falling outside the limits of the ‘exact’ confidence interval can be quite different from the assumed nominal value. For example, if the expected number of events is between 1 and 50, the median probability of an ‘in control’ institution falling above the upper limit of a 95% control interval and, hence, being identified as a potential outlier, is 0.012 instead of 0.025: i.e. less than half the presumed probability. SMRs are often based on very small numbers of events, and the probability of being identified by this method is reduced particularly when λ is small.
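The size of this shortfall can be illustrated with a minimal sketch. It assumes that the ‘exact’ upper control limit at an integer expected count λ is the upper end of the exact Poisson confidence interval evaluated at an observed count equal to λ; this construction is an assumption made for illustration and may differ in detail from the one used for the results reported here. The function names are hypothetical.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i        # P(X = i) obtained from P(X = i - 1)
        total += term
    return min(total, 1.0)

def exact_ci_upper(observed, alpha):
    """Upper end of the 'exact' two-sided Poisson interval: the mean mu
    that solves P(X <= observed | mu) = alpha / 2, found by bisection."""
    lo, hi = float(observed), float(observed) + 10.0
    while poisson_cdf(observed, hi) > alpha / 2:
        hi *= 2.0              # widen the bracket until it contains the root
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if poisson_cdf(observed, mid) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def prob_above_exact_limit(expected, alpha):
    """True probability that an 'in control' count exceeds the limit.
    ASSUMPTION: the limit is the exact interval evaluated at
    observed == expected, with expected an integer."""
    upper = exact_ci_upper(expected, alpha)
    # For an integer count X, X > upper is the same event as X > floor(upper).
    return 1.0 - poisson_cdf(math.floor(upper), expected)

expected = 10
upper_count = exact_ci_upper(expected, 0.05)
p_above = prob_above_exact_limit(expected, 0.05)
print(f"upper SMR limit = {upper_count / expected:.3f}, P(above) = {p_above:.4f}")
```

Under these assumptions, λ = 10 gives a probability of roughly 0.007 of falling above the upper 95% limit, well below the nominal 0.025.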
It is also important to consider the properties of the method used when interpreting any results or limits produced. Although confidence intervals are often more familiar to the reader, a disadvantage of their use in this context is that they are often interpreted incorrectly.
Probability-based prediction intervals allow a more straightforward interpretation of control limits. In this paper they were defined so that the probability of falling outside a control limit was always less than, or equal to, the nominal probability: for example, the probability of falling above the upper limit of 95% control limits is always less than, or equal to, 0.025. However, the control limits could equally have been derived so that the probability of an observation from an ‘in control’ provider falling outside of the limits was at least equal to the nominal value or, indeed, some combination of the two approaches to obtain a value that produced a probability closest to the nominal value [6]. The decision of which of these options to use will depend on various factors, including the clinical question of interest. However, the important point is that if the control limits are obtained from probability-based prediction intervals then this property of the limits can be specified a priori. This cannot be done if the control limits are based on confidence intervals.
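The three conventions described above can be sketched for the upper limit as follows. This is an illustrative implementation only: the function names and the rule labels (`at_most`, `at_least`, `closest`) are hypothetical, and the limit is returned on the count scale (dividing by λ gives the limit on the SMR scale).

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i        # P(X = i) obtained from P(X = i - 1)
        total += term
    return min(total, 1.0)

def upper_tail(k, lam):
    """P(X > k) for X ~ Poisson(lam)."""
    return 1.0 - poisson_cdf(k, lam)

def upper_limit(lam, alpha, rule="at_most"):
    """Upper control limit (a count) for an 'in control' Poisson(lam) unit.

    rule="at_most":  smallest k with P(X > k) <= alpha / 2
    rule="at_least": largest  k with P(X > k) >= alpha / 2
    rule="closest":  whichever of the two has a tail nearest alpha / 2
    A result of -1 under "at_least" means every count lies above the limit.
    """
    target = alpha / 2.0
    k = -1                                  # P(X > -1) = 1 >= target
    while upper_tail(k + 1, lam) >= target:
        k += 1
    at_least = k
    at_most = k if upper_tail(k, lam) <= target else k + 1
    if rule == "at_least":
        return at_least
    if rule == "at_most":
        return at_most
    if rule == "closest":
        return min((at_least, at_most),
                   key=lambda c: abs(upper_tail(c, lam) - target))
    raise ValueError(f"unknown rule: {rule}")

lam = 10
for rule in ("at_most", "at_least", "closest"):
    k = upper_limit(lam, 0.05, rule)
    print(f"{rule:9s} limit = {k}, SMR = {k / lam:.2f}, P(above) = {upper_tail(k, lam):.4f}")
```

Because the count is discrete, no choice of limit gives a tail of exactly 0.025 at λ = 10; the rule determines on which side of the nominal value the true probability falls, and this is known before any data are seen.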
Funnel plots can be used to answer questions other than just “Which providers’ results are not compatible with the target” [23]. While investigating alternative approaches is beyond the scope of this paper, the same principle applies that only the use of prediction intervals can produce control limits with probability properties specified a priori.
It also seems appropriate that the limits should be symmetrical, that is, that they have the same properties for falling above the upper control limit as for falling below the lower control limit. The Stata function FUNNELCOMPAR, for example, has asymmetrical tails in that the probability of an observation falling below the lower limit is always less than, or equal to, the nominal value (i.e. P(X < lower limit) ≤ α/2) whereas the probability of an observation falling above the upper limit is always at least the nominal value (i.e. P(X > upper limit) ≥ α/2) […]. Such asymmetry makes the funnel plots difficult to interpret.
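The effect of mixing the two conventions can be seen in a small sketch that builds a lower limit whose tail is at most α/2 and an upper limit whose tail is at least α/2, and then reports the true tail probabilities. It follows only the properties described in the text and is not a reproduction of the Stata command; the function names are hypothetical.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i        # P(X = i) obtained from P(X = i - 1)
        total += term
    return min(total, 1.0)

def lower_limit_at_most(lam, alpha):
    """Largest count l with P(X < l) <= alpha / 2."""
    l = 0                                       # P(X < 0) = 0
    while poisson_cdf(l, lam) <= alpha / 2.0:   # P(X < l + 1) = P(X <= l)
        l += 1
    return l

def upper_limit_at_least(lam, alpha):
    """Largest count u with P(X > u) >= alpha / 2 (-1 if no count qualifies)."""
    u = -1                                      # P(X > -1) = 1
    while 1.0 - poisson_cdf(u + 1, lam) >= alpha / 2.0:
        u += 1
    return u

lam, alpha = 10, 0.05
lower = lower_limit_at_most(lam, alpha)
upper = upper_limit_at_least(lam, alpha)
p_below = poisson_cdf(lower - 1, lam)           # P(X < lower)
p_above = 1.0 - poisson_cdf(upper, lam)         # P(X > upper)
print(f"lower = {lower}, P(below) = {p_below:.4f}")
print(f"upper = {upper}, P(above) = {p_above:.4f}")
```

For λ = 10 and 95% limits, the lower tail falls well short of 0.025 while the upper tail exceeds it, so the two limits do not carry the same interpretation.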
It could be argued that any control limits are only ever approximate given the uncertainties in the data, any statistical modelling, the target, etc. However, funnel plot limits continue to be used to identify potentially poorly performing institutions in order to initiate further investigations. Therefore, a full and correct understanding of funnel plots is needed in order to avoid the unnecessary investigation of ‘in control’ providers or the failure to investigate true outliers. Such investigations can have important consequences in themselves, whether or not the provider is ultimately deemed to be a true outlier.
In this paper 95% and 99.8% control limits were investigated as these are the limits most commonly used for monitoring health care providers. These particular control limits are unlikely to be optimal in all circumstances and careful consideration should always be given to the choice of limits. However, the properties of the potential methods to calculate the limits described in this paper are likely to hold whatever limits are selected.