Previous studies of diabetes incidence among Canadian adults based on administrative data relied on the National Diabetes Surveillance System (NDSS) case definition and used a disease-free observation period to remove the prevalent pool effect [1, 6]. In our study, the retrograde survival function showed that a five-year clearance period reliably distinguishes new cases from the prevalent pool. Although kappa agreement was already excellent after four years, we proceeded with the retrograde survival function to account for the high prevalence of patients with diabetes in the database.
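As a rough illustration of the retrograde approach, the sketch below starts from index-year cases and looks backward one year at a time, reporting the proportion of cases still classified as new at each lookback length. The function name, the input encoding, and the toy data are assumptions made for illustration; they are not the study's code or data.

```python
from collections import Counter


def retrograde_survival(years_back_to_prior_record, max_lookback):
    """Proportion of index-year cases still counted as new after looking
    back k years, for k = 0..max_lookback.

    years_back_to_prior_record: one entry per index-year case, giving how
    many years before the index year the most recent earlier diabetes
    record occurred (1 = the previous year), or None if no earlier record
    exists. This encoding is an assumption of the sketch.
    """
    n = len(years_back_to_prior_record)
    if n == 0:
        raise ValueError("at least one index-year case is required")

    # Count how many cases are reclassified as prevalent at each lookback year.
    reclassified = Counter(
        y for y in years_back_to_prior_record if y is not None
    )

    survival = [1.0]  # with no lookback, every case counts as new
    still_new = n
    for k in range(1, max_lookback + 1):
        still_new -= reclassified.get(k, 0)
        survival.append(still_new / n)
    return survival


# Hypothetical toy data: ten index-year cases.
toy_cases = [1, 1, 2, 3, None, None, None, None, None, 5]
curve = retrograde_survival(toy_cases, max_lookback=10)
```

In this toy example the curve stops falling once the lookback reaches five years, which is the kind of plateau used to choose a clearance period.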
Diabetes is an insidious disease that may go undiagnosed for many years. In the retrograde method, we started with the diabetes cases that already met the NDSS criteria in 2002. Canadian studies show that the NDSS case definition has a sensitivity above 86%, a specificity around 97%, and a positive predictive value of 80% [1, 6]. We aimed to separate prevalent cases from the incident cases identified by the NDSS case definition, so high sensitivity and low selection bias were our main concerns. In our study, the NDSS method and the one hit method differed significantly (P < 0.0001) in the incident cases they identified. The difference was 1.4% (3788) for a 5-year clearance period using the one hit method and 1.8% (4844) for a 10-year clearance period using the NDSS method. This finding confirms those of previous studies [6]. As reported by Wilson, requiring more than one ICD-9 code 250 to identify diabetes cases in an administrative database over a three-year period results in a significant loss of sensitivity, and the specificity of a single ICD-9 code 250 is nearly the same as that of two ICD-9 codes 250 [16]. On the other hand, this result should be interpreted in the context of the study's design. A large administrative dataset provides a sample size sufficient to observe such relations, and the NDSS definition (i.e., two physician visits or one hospitalization within two years) is more restrictive than the one hit method. Because the two methods perform similarly, the simpler method that is slightly more sensitive is preferable to a more demanding method that is somewhat less sensitive.
In practice, it is usual to exclude patients who received the same diagnosis in the years before the beginning of the study period [1, 6, 17]. Previous studies used different cut-off points for the clearance period, depending on the number of years of registration available in the databases [1, 6, 17]. In our study, we used the kappa agreement method [13] and the retrograde survival function to validate the optimal clearance period for determining diabetes incidence from administrative data.
We found excellent agreement (kappa > 0.90) between a 10-year clearance period and clearance periods of 4 years or more. However, such high agreement may be related to the high prevalence of cases in our database [13]. For this reason, we continued our analysis with the retrograde survival function. This method showed that five years is an appropriate duration for a clearance period; the observed risk of being a repeated case remained constant thereafter.
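The kappa comparison between two clearance periods can be sketched as Cohen's kappa on a 2x2 table in which each index-year case is classified as incident or prevalent under each period. The function name and the counts below are hypothetical and serve only to show the calculation.

```python
def cohen_kappa(both_incident, only_long, only_short, neither):
    """Cohen's kappa for two binary classifications of the same cases.

    both_incident: incident under both clearance periods
    only_long:     incident under the longer period only
    only_short:    incident under the shorter period only
    neither:       prevalent under both periods
    """
    n = both_incident + only_long + only_short + neither
    if n == 0:
        raise ValueError("the table must contain at least one case")

    observed = (both_incident + neither) / n
    p_long = (both_incident + only_long) / n    # incident share, longer period
    p_short = (both_incident + only_short) / n  # incident share, shorter period
    expected = p_long * p_short + (1 - p_long) * (1 - p_short)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts. A case with no record in the longer window also has
# none in the shorter window, so only_long is 0 here.
kappa = cohen_kappa(both_incident=800, only_long=0, only_short=50, neither=150)
```

Because the chance-agreement term depends on the marginal proportions, the same number of discordant cases can yield different kappa values in databases with different case mixes.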
The accuracy of our incidence estimates may also be affected by the validity of the diagnosis codes recorded in administrative databases and by time. Canadian studies of administrative data have shown good agreement between cases recorded in administrative databases and self-reported disease [11, 18]. To evaluate the effect of the index year (2002 in our study) on the chance of being a preceding case, we evaluated the reproducibility of a 10-year vs. 5-year clearance period for different index years, including 1999 and 2000. We found kappa > 0.90 in all cases (data not shown).
We also evaluated the consistency of a five-year clearance period when different selection criteria were used. Kappa was > 0.90 between a clearance period of 10 years and a clearance period of 4 years or more, whether the one hit method or the NDSS method was used.
Our study has some limitations. We used medico-administrative data to find incident cases. This may underestimate the true incidence, because most people with diabetes remain asymptomatic for years after onset of the disease and because only patients who received health care services are entered in these databases [19, 20]. In Canada, with universal insurance coverage, every encounter with the health system is recorded in medico-administrative registries. To verify that patients had contact with health services (for any diagnosis) during the clearance period, we examined the 37473 incident cases of diabetes for records related to other services; only 1301 (3.5%) had no record of another kind of service. Three conditions could result in non-detection of potential cases of diabetes during the study period:
1- people who did not need services; 2- diabetic patients who used services for other reasons; 3- sick people who left the country.
One study in Manitoba showed a probability of 0.96 that diabetic patients would have a subsequent medical contact for diabetes within two years [1]. A validation study using patient records and administrative data in Ontario found that, for the 1- and 2-claim algorithms respectively, sensitivity was 90% and 86%, specificity was 92% and 97%, positive predictive value was 61% and 80%, and negative predictive value was 99% and 98% [6]. On the basis of Canadian census data on out-of-country and out-of-province migration, we estimated that fewer than 5% of these patients would have migrated into or out of Quebec during the study period (details available from Statistics Canada). Therefore, the missing data were unlikely to substantially affect the reported results.
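The predictive values quoted above for the 1- and 2-claim algorithms depend on disease prevalence as well as on sensitivity and specificity. The sketch below applies the standard Bayes relation; the prevalence of 0.12 is an assumption chosen for illustration and is not reported in the text, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence,
    all expressed as proportions between 0 and 1."""
    for name, value in (
        ("sensitivity", sensitivity),
        ("specificity", specificity),
        ("prevalence", prevalence),
    ):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must lie between 0 and 1")

    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)

    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive values are undefined for these inputs")

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ASSUMED_PREVALENCE = 0.12  # hypothetical; not taken from the validation study

ppv_one_claim, npv_one_claim = predictive_values(0.90, 0.92, ASSUMED_PREVALENCE)
ppv_two_claim, npv_two_claim = predictive_values(0.86, 0.97, ASSUMED_PREVALENCE)
```

Under this assumed prevalence the 2-claim algorithm gives a noticeably higher positive predictive value than the 1-claim algorithm, while the negative predictive values remain close to each other, which matches the direction of the figures quoted from the Ontario study.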
In our database, it was not possible to differentiate between type 1 and type 2 diabetes.
The results may not be representative of younger populations with diabetes, because the patterns and temporal evolution of the disease differ between type 1 and type 2 diabetes and the incidence of the former is higher in young populations. Furthermore, we did not assess clearance periods longer than ten years. However, the results of our retrograde survival function analyses showed that the risk of being a previous case stabilized after a clearance period of 5 or more years.