In this study we showed how routinely collected data from several general practitioner registration networks (GPRNs) can be combined to estimate incidence and prevalence for 18 chronic diseases. We examined the uncertainty around these estimates from different angles. First, we estimated confidence intervals both with and without the between-GPRN variance. Next, we calculated a stochastic ranking of diseases in terms of incidence and prevalence. This was done using Monte Carlo simulation, which consisted of randomly drawing from the estimated probability distributions of the disease prevalences and incidences, each time obtaining a particular ordering. After repeating this process many times, a distribution of rankings emerges that provides insight into the likelihood of orderings different from the "standard" ordering based on the point estimates. Our simulations showed that wide confidence intervals for some individual diseases also influence the rank of other diseases with narrower confidence intervals. In this respect, the results of our study showed that league tables of diseases should be interpreted with caution, as the standard ordering based on point estimates may not reflect the "real" ordering accurately. These findings are relevant for public health policy and the monitoring of chronic diseases. Furthermore, our findings underline the need for effective tools to communicate uncertainty to policy makers.
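The stochastic ranking described above can be illustrated with a minimal sketch. It is not the code used in this study: the function name `rank_distribution`, the normal distribution on the log scale, and the three diseases with their values are all hypothetical assumptions chosen for illustration only.

```python
import numpy as np

def rank_distribution(log_means, log_ses, n_draws=10_000, seed=0):
    """Monte Carlo rank probabilities, assuming independent diseases.

    Assumption (not from the study): each estimate is normally
    distributed on the log scale with the given mean and standard error.
    Returns probs where probs[i, r] is the share of draws in which
    disease i obtained rank r + 1 (rank 1 = highest rate).
    """
    rng = np.random.default_rng(seed)
    log_means = np.asarray(log_means, dtype=float)
    log_ses = np.asarray(log_ses, dtype=float)
    n = log_means.size

    # One row per simulation run, one column per disease.
    draws = rng.normal(log_means, log_ses, size=(n_draws, n))

    # Order diseases from highest to lowest within each run,
    # then convert that ordering into a rank per disease.
    order = np.argsort(-draws, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(n_draws)[:, None]
    ranks[rows, order] = np.arange(1, n + 1)

    # Tabulate how often each disease obtained each rank.
    probs = np.zeros((n, n))
    for i in range(n):
        probs[i] = np.bincount(ranks[:, i] - 1, minlength=n) / n_draws
    return probs

# Hypothetical values: diseases A and C are estimated precisely,
# disease B has a wide interval.
names = ["A", "B", "C"]
log_means = np.log([50.0, 30.0, 28.0])
log_ses = [0.05, 0.60, 0.05]

probs = rank_distribution(log_means, log_ses)
for name, row in zip(names, probs):
    print(name, np.round(row, 2))
```

In this hypothetical example, disease A has a narrow interval yet does not hold rank 1 in every run, because the imprecisely estimated disease B overtakes it in a share of the draws. This is the mechanism by which wide intervals for one disease affect the ranks of the others.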

It should be noted that in this study we only presented a ranking of a small set of diseases. Increasing the number of diseases would probably lead to more overlap in confidence intervals and, therefore, more uncertainty in the ranking. Furthermore, in constructing the stochastic ranking we assumed independence between diseases. However, some diseases are causally related (e.g. diabetes and coronary heart disease) or share common risk factors such as smoking and body mass index. If the dependency between diseases could be quantified, the stochastic ranking of diseases might again turn out to be different.

A drawback of our study is that we only used secondary data at the GPRN level, which did not include demographic characteristics besides age and sex, and that we could not assess disease classification errors within our data set [26-28]. We did not have data at the GP practice level and could not verify how accurately GPs establish their diagnoses and code them into their computerised files. We can only hope that they adhere to the professional standards set forth in guidelines. Given these limitations of the crude data, we chose methods of analysis that take these potential sources of bias into account. The major challenge was to combine the data while doing justice to the heterogeneity between the different registries. This was achieved by modelling the data in a hierarchical (multi-level) fashion using generalized linear models, in which the registries were modelled as a random intercept. Since the goal of our study was explicitly to focus on the typical (in our case average) GPRN instead of the average subject, we chose a random effects model over a GEE model [29]. This assumes that there is such a thing as an "average" practice, which can be used to extrapolate estimates valid for the population. An additional advantage of the random effects model is that it provided an estimate of the variability between GPRNs. We presented this variability graphically and incorporated it in our simulations, and saw that, especially for incidence, the between-GPRN variance was considerable. This could suggest that GPRNs differ in how they distinguish between 'new' and 'old' cases.
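The effect of including the between-GPRN variance in an interval can be sketched as follows. This is an illustration under stated assumptions, not the computation used in this study: we assume a log link, a normal approximation on the log scale, and that the random-intercept variance is simply added to the squared standard error of the fixed intercept. The function name `rate_interval` and all numbers are hypothetical.

```python
import math

def rate_interval(log_rate, se, between_sd=0.0, z=1.96):
    """Interval for a rate estimated on the log scale.

    Assumption (not from the study): the total variance is the squared
    standard error of the intercept plus the random-intercept variance.
    With between_sd=0 this is the usual interval for the average GPRN.
    """
    total_sd = math.sqrt(se ** 2 + between_sd ** 2)
    lower = math.exp(log_rate - z * total_sd)
    upper = math.exp(log_rate + z * total_sd)
    return lower, upper

# Hypothetical values: rate of 20 per 1000, standard error 0.10 on the
# log scale, between-GPRN standard deviation 0.40 on the log scale.
log_rate = math.log(20.0)
narrow = rate_interval(log_rate, se=0.10)
wide = rate_interval(log_rate, se=0.10, between_sd=0.40)
print("without between-GPRN variance:", narrow)
print("with between-GPRN variance:   ", wide)
```

Under these assumptions the second interval always contains the first, and the gap between them grows with the between-GPRN standard deviation.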

There are several ways to estimate morbidity rates in a population in a cross-sectional manner [28]. Examples are health interview surveys and health examination surveys. Health interview surveys have inherent limitations because they rely on self-report. Hospital-based data, although readily available in many countries, have the important disadvantage of selection bias, as they include only those cases that lead to hospitalization. Information gathered in general practices, on the other hand, does not have these drawbacks. In countries in which the majority of the population is registered with a GP, such as the UK and the Netherlands, a source is available that could, in principle, be used to derive reliable and representative estimates of descriptors of national health [30-32]. In the best of all possible worlds, all GPs would diagnose and document disease (episodes) in a uniform manner, reporting them comprehensively to a central databank. In reality, however, there are important obstacles to using this source in a sound manner. Firstly, the availability of data is limited, because a national system of data collection for primary health care does not exist in most countries. Fortunately, several GPRNs have been started in the Netherlands over the past decades, which collect and share their morbidity data on a regular and structured basis. Secondly, diagnostic criteria and procedures are not always clear-cut and unambiguous, leaving room for differences in "case finding" and case ascertainment [28]. Under such conditions, differences between GPRNs (as captured in our models by the random intercept) can lead to large differences in estimates of prevalence and incidence. We hypothesize that for diseases for which uncertainty intervals are wide (such as arthritis), differences in case finding between GPRNs may offer an explanation for our findings [28]. Furthermore, differences in registration length between GPRNs might explain the between-GPRN variance for some diseases, because the longer the registration period, the lower the probability that a prevalent case is misclassified as incident [33].

In conclusion, estimates of incidence and prevalence can be obtained by combining data from different GPRNs, but their confidence intervals must be taken into account. Monte Carlo simulation techniques can be used to assess uncertainty in the relative rankings of diseases. League tables of diseases based on point estimates should be interpreted with caution.