This is a large-scale historical cohort study of all adult individuals insured by Clalit Health Services, the largest health maintenance organization in Israel, which insures and provides healthcare to 55% of the Israeli population (about 3.9 million people). Study entry is defined as January 1, 2002. Inclusion criteria are: age 21 years or older at study entry, and continuous insurance in Clalit Health Services from at least one year prior to study entry (January 1, 2001) until the end of the study period (December 31, 2011), or until death for those who do not survive to the end of follow-up. Of the study population of over 2,300,000, over 110,000 had diabetes at study entry, and an additional 350,000 were diagnosed with diabetes during the study period (January 2002 to December 2011).
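The inclusion criteria can be expressed as a simple eligibility check. The following is a minimal sketch, not the study's extraction logic: the function and field names are hypothetical, and it assumes that a person must be alive at study entry and that insurance coverage is summarized by a single continuous start and end date.

```python
from datetime import date
from typing import Optional

STUDY_ENTRY = date(2002, 1, 1)
INSURED_SINCE_REQUIRED = date(2001, 1, 1)  # one year before study entry
STUDY_END = date(2011, 12, 31)
MIN_AGE_YEARS = 21


def age_on(birth_date: date, on: date) -> int:
    """Completed years of age on a given date."""
    had_birthday = (on.month, on.day) >= (birth_date.month, birth_date.day)
    return on.year - birth_date.year - (0 if had_birthday else 1)


def is_eligible(birth_date: date, insured_from: date, insured_until: date,
                death_date: Optional[date] = None) -> bool:
    """Apply the age and continuous-insurance criteria described in the text."""
    if age_on(birth_date, STUDY_ENTRY) < MIN_AGE_YEARS:
        return False
    if insured_from > INSURED_SINCE_REQUIRED:
        return False
    # Assumption: individuals who died before study entry are not in the cohort.
    if death_date is not None and death_date < STUDY_ENTRY:
        return False
    # Coverage must last until study end, or until death if that comes first.
    required_until = STUDY_END if death_date is None else min(death_date, STUDY_END)
    return insured_until >= required_until
```

For example, a person born on January 2, 1981 is 20 years old at study entry and would be excluded under this sketch.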
This study has three Specific Aims: First, we will compare rates of cancer incidence (overall and specific cancers, including second primary neoplasms) in individuals with and without diabetes (separately for prevalent and incident cases of diabetes). Second, we will investigate, in individuals with diabetes, the associations between measures of glucose control, as assessed by HbA1c and fasting plasma glucose, and cancer incidence (overall and specific cancers, including second primary neoplasms). Third, we will compare the incidence of cancer (overall and specific cancers, including second primary neoplasms), among individuals with diabetes who used specific glucose-lowering medications, as well as among patients with no drug exposure (periods of time during which individuals classified with diabetes did not yet purchase glucose-lowering medications, including periods on diet and exercise regimen only).
Four study groups will be established according to the status of diabetes and cancer at study entry, January 1, 2002 (Figure 1): cancer free, diabetes free (CF-DF); cancer free, diabetes prevalent (CF-DP); cancer prevalent, diabetes free (CP-DF); and cancer prevalent, diabetes prevalent (CP-DP). Individuals free of diabetes at study entry will be followed for diabetes incidence, and all four groups will be followed for cancer incidence (second primaries for those with prevalent cancer). Nine groups, according to diabetes and cancer prevalence at study entry, and diabetes and cancer incidence during the study period, will be analyzed (Figure 2).
Figure 1 The ten year follow-up of the four study groups, according to disease status at study entry: diabetes free (DF) or prevalent (DP), and cancer free (CF) or prevalent (CP). For the groups that are DF at study entry, diabetes incidence (DI) will be followed ...
Figure 2 The study cohort according to diabetes and cancer status during the ten year follow-up period. DP: prevalent diabetes; DI: incident diabetes; DF: diabetes free; CP: prevalent cancer; CI: incident cancer; CF: cancer free.
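The four baseline groups and the nine follow-up groups follow from crossing diabetes status with cancer status. A minimal sketch of that classification is given below; the function names are hypothetical, and it assumes each disease is summarized by the date of its first recorded diagnosis, with a diagnosis dated before study entry counted as prevalent.

```python
from datetime import date
from typing import Optional

STUDY_ENTRY = date(2002, 1, 1)
STUDY_END = date(2011, 12, 31)


def disease_status(first_dx: Optional[date], letter: str) -> str:
    """Return '<letter>P', '<letter>I' or '<letter>F' for one disease."""
    if first_dx is not None and first_dx < STUDY_ENTRY:
        return letter + "P"  # prevalent at study entry
    if first_dx is not None and first_dx <= STUDY_END:
        return letter + "I"  # incident during follow-up
    return letter + "F"      # free of the disease throughout follow-up


def baseline_group(diabetes_dx: Optional[date], cancer_dx: Optional[date]) -> str:
    """One of the four groups at study entry, e.g. 'CF-DP'."""
    cancer = "CP" if disease_status(cancer_dx, "C") == "CP" else "CF"
    diabetes = "DP" if disease_status(diabetes_dx, "D") == "DP" else "DF"
    return f"{cancer}-{diabetes}"


def follow_up_group(diabetes_dx: Optional[date], cancer_dx: Optional[date]) -> str:
    """One of the nine groups over the whole study period, e.g. 'CI-DI'."""
    return f"{disease_status(cancer_dx, 'C')}-{disease_status(diabetes_dx, 'D')}"
```

Under these assumptions, a person diagnosed with diabetes in 2000 and with a first cancer in 2004 starts in CF-DP and ends in the CI-DP analysis group.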
Prevalent diabetes was determined according to physician reports to the chronic disease registry of Clalit Health Services. Incident diabetes will be determined by any one of the following criteria: two fasting glucose readings of 126 mg/dl or above during the course of a year, HbA1c greater than 7.0%, 3 recorded purchases of glucose-lowering medications during the course of a year, diagnosis of diabetes according to the Chronic Disease Registry of Clalit Health Services, or a written diagnosis by a physician (in the community or hospital). The relatively low number of individuals classified with diabetes at study entry (prevalent diabetes) is apparently due to the more limited definition of diabetes used by Clalit Health Services prior to 2002.
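The "any one of the following" rule for incident diabetes can be sketched as below. This is an illustration under stated assumptions rather than the registry's algorithm: the function names and input shapes are hypothetical, and "during the course of a year" is interpreted here as any rolling 365-day window, which the text does not specify.

```python
from datetime import date
from typing import Iterable, List, Tuple


def k_events_within(dates: Iterable[date], k: int, window_days: int = 365) -> bool:
    """True if at least k of the dates fall within some window of window_days."""
    ordered: List[date] = sorted(dates)
    return any(
        (ordered[i + k - 1] - ordered[i]).days <= window_days
        for i in range(len(ordered) - k + 1)
    )


def meets_incident_diabetes_criteria(
    fasting_glucose: Iterable[Tuple[date, float]],  # (test date, mg/dl)
    hba1c_values: Iterable[float],                  # percent
    medication_purchases: Iterable[date],
    in_chronic_disease_registry: bool = False,
    physician_diagnosis: bool = False,
) -> bool:
    """Any single criterion from the text is sufficient."""
    high_glucose_dates = [d for d, mg_dl in fasting_glucose if mg_dl >= 126]
    return (
        k_events_within(high_glucose_dates, 2)
        or any(value > 7.0 for value in hba1c_values)
        or k_events_within(medication_purchases, 3)
        or in_chronic_disease_registry
        or physician_diagnosis
    )
```

A calendar-year interpretation of the one-year window would classify some individuals differently, so the choice should be checked against the registry's own definition.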
Exposure to glucose-lowering medications
In addition to the exposure to diabetes per se, individuals with diabetes will be studied for their exposure to glucose-lowering medications. A drug exposure will be defined as a minimum of 3 purchases of glucose-lowering medications during one year. Each drug exposure will define one type of treatment or a combination of treatments. An individual may be analyzed according to more than one exposure during follow-up time. Drug exposures to be investigated are: each insulin type, meglitinide derivatives, sulfonylureas, biguanides, alpha glucosidase inhibitors, thiazolidinediones, and incretins. An additional group (“no glucose-lowering medication”) will comprise periods of time during which individuals classified with diabetes did not yet purchase glucose-lowering medications.
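As an illustration of the exposure definition, the sketch below assigns a person to every drug class with at least 3 purchases within one year. It is a simplification under stated assumptions: the function name and input shape are hypothetical, the threshold is applied per drug class over a rolling 365-day window, and a person with no qualifying class is labeled with the "no glucose-lowering medication" group as a whole rather than for specific periods of time.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Set, Tuple

NO_MEDICATION = "no glucose-lowering medication"


def exposure_classes(
    purchases: Iterable[Tuple[date, str]],  # (purchase date, drug class)
    min_purchases: int = 3,
    window_days: int = 365,
) -> Set[str]:
    """Drug classes with at least min_purchases purchases inside one window."""
    by_class: Dict[str, List[date]] = defaultdict(list)
    for purchase_date, drug_class in purchases:
        by_class[drug_class].append(purchase_date)

    exposed: Set[str] = set()
    for drug_class, dates in by_class.items():
        dates.sort()
        for i in range(len(dates) - min_purchases + 1):
            if (dates[i + min_purchases - 1] - dates[i]).days <= window_days:
                exposed.add(drug_class)
                break
    # More than one class in the result corresponds to a combination of treatments.
    return exposed or {NO_MEDICATION}
```

Because the result is a set, a person who qualifies for both biguanides and sulfonylureas is represented as a combination exposure.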
For each glucose-lowering medication, cumulative exposure will be assessed as the total number of monthly dosages purchased. Daily dosages and daily dosages per body weight will be estimated from records of purchases. Time-dependent regression models for the outcomes investigated will account for glucose-lowering medications as they are added and changed over the follow-up period, adjusting for constant and time-dependent covariates.
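Time-dependent regression models are commonly fitted to data in counting-process form, with one row per interval during which the covariate is constant. The sketch below builds such rows for cumulative exposure to one medication. The function name and input layout are hypothetical, time is measured in days since study entry, and each purchase is assumed to raise cumulative exposure on the day it is recorded.

```python
from typing import Iterable, List, Tuple


def cumulative_exposure_intervals(
    entry_day: int,
    exit_day: int,
    purchases: Iterable[Tuple[int, float]],  # (day of purchase, monthly dosages)
) -> List[Tuple[int, int, float]]:
    """Return (start, stop, cumulative monthly dosages) rows covering follow-up."""
    rows: List[Tuple[int, int, float]] = []
    cumulative = 0.0
    start = entry_day
    for day, doses in sorted(purchases):
        if day >= exit_day:
            break  # purchases at or after exit do not affect follow-up
        if day > start:
            # Close the interval during which exposure was unchanged.
            rows.append((start, day, cumulative))
            start = day
        cumulative += doses
    rows.append((start, exit_day, cumulative))
    return rows
```

Each row can then carry the event indicator for its interval, with the event attached only to the last row of a person who develops cancer.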
The following variables will be included together with the time they were recorded:
1. Age, gender, BMI, type of medical insurance (basic, or basic and supplementary).
2. Smoking status (non-smoker, former smoker, current smoker)
3. Concomitant medications (types, exposure rates, and cumulative exposures), including antihypertensives, lipid-lowering agents, beta-blockers, and angiotensin-converting enzyme (ACE) inhibitors
4. Comorbidities defined according to the Clalit Registry of Chronic Diseases, with focus on those associated with cancer risk (e.g. IBD, COPD, gallstones)
5. Blood pressure (mean annual values)
6. Reproductive factors: hormone replacement therapy, fertility treatments
7. Biochemical characteristics (first and last tests during follow-up, mean annual values, and number of blood tests performed annually): serum lipids (total cholesterol, triglycerides, HDL-cholesterol, VLDL-cholesterol), liver enzymes, creatinine, and urine protein
8. Clinical examinations (average annual number of visits to the treating physician during follow-up)
9. Cancer screening tests: mammography, fecal occult blood, colonoscopy, PAP smear, colposcopy, prostate screening (PSA, ultrasound)
10. Medical procedures associated with cancer risk (e.g. cholecystectomy).
We note that data on some key potential confounders (like smoking and fertility treatments) are available for only a subset of the cohort. We will conduct sensitivity analyses in these subsets to assess whether these factors indeed confound the associations between diabetes, glucose-lowering medications, and cancer.
Incidence of cancer (both first and second primary neoplasms) will be the primary outcome. For diagnoses of cancer, the following data from the Israel National Cancer Registry will be linked to the study file: date of diagnosis; place of diagnosis; hospital or other reporting source; disease type, site, morphology, and stage at diagnosis; tumor behavior (in-situ, benign, malignant, borderline, and uncertain); basis of diagnosis (pathology report, clinical only, imaging devices, or based on death certificates only); and mortality data. Mortality data are continuously updated by reports from the Ministry for Internal Affairs.
A database will be constructed for the proposed study from the Clalit Health Services data warehouse. Available are demographic data and vital status from the Israeli Ministry of Interior Affairs and the National Insurance Institute (Social Security); drug dispensing data from pharmacies; and medical data such as laboratory tests, imaging results, diagnoses and clinical data (blood pressure, weight, height, etc.) from Clalit Health Services in-house facilities, as well as from outside suppliers. The data are accessible at the member level, and linked to all registries, including the Cancer Registry, by means of a national identification number, a unique identifier possessed by all Israeli citizens. Data will be organized according to the BI (business intelligence) schema or infrastructure and will be extracted using BI programs. Extraction will be executed via the BO (business object) and SQL Server Management Studio programs, depending on the complexity and diversity of the data requested from the data warehouse. BI analyst specialists will conduct the extraction and merge the data.
The most conservative estimates yield at least 175,000 incident cases of cancer in the total cohort. Thus, adequate power is expected for a large number of cancer types and treatment regimens. The sample size for analysis will decrease as the number of treatment switches increases and as certain combinations become rare. Nevertheless, power will be adequate to find relatively small differences based even on subsets of 500–1000 events for a particular treatment or treatment combination. Specifically, we will have over 80% power (based on two-tailed testing at the 0.05 level) to detect a hazard ratio of 1.4 for two subsets of 6000 persons each.
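The power statement can be checked approximately with Schoenfeld's formula, which ties power in a two-group proportional hazards comparison to the total number of events rather than to the number of persons. The sketch below is one possible check, not the calculation used by the authors; the function names are hypothetical and equal group sizes are assumed.

```python
from math import log, sqrt
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def required_events(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.80, allocation: float = 0.5) -> float:
    """Total events needed under Schoenfeld's approximation (two-sided test)."""
    z_alpha = _STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STANDARD_NORMAL.inv_cdf(power)
    return (z_alpha + z_beta) ** 2 / (
        allocation * (1 - allocation) * log(hazard_ratio) ** 2
    )


def power_for_events(events: float, hazard_ratio: float,
                     alpha: float = 0.05, allocation: float = 0.5) -> float:
    """Approximate power given the total number of observed events."""
    z_alpha = _STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    effect = sqrt(events * allocation * (1 - allocation)) * abs(log(hazard_ratio))
    return _STANDARD_NORMAL.cdf(effect - z_alpha)
```

Under this approximation, a hazard ratio of 1.4 at 80% power and a two-sided 0.05 level requires roughly 277 events in total, so two subsets of 6000 persons each would reach the stated power if a little over 2% of them developed the cancer of interest during follow-up.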
For our first specific aim, we will compare, after adjustment, all and site-specific cancer rates between individuals with and without diabetes. For the second aim, we will investigate whether metabolic control, as indicated by HbA1c and blood glucose levels, is related to cancer risk. Third, we will evaluate differences in outcomes that associate with the use of one or a combination of glucose-lowering treatments. In all of these analyses we will stratify persons with diabetes by those who were already diagnosed with diabetes at study entry (prevalent diabetes), and those who were diagnosed during follow-up (incident diabetes).
Preliminary data analysis will employ standard methods, starting with calculation of person-years and age- and gender-standardized cancer incidence rates for all sites and selected sites in non-diabetics, and in prevalent and incident diabetics without prevalent cancer. These methods will also be used to investigate potential confounding variables such as baseline smoking status and body mass index. In our analyses of Specific Aims 2 and 3, we will then proceed to use Cox regression with time-dependent treatment variables. For Specific Aim 2, the model will include current and past levels of HbA1c and blood glucose, investigating the level above selected thresholds cumulated over the past n years, where n may be between 1 and 5. For Specific Aim 3, the model will include current and previous treatments. For example, we will investigate exposure to a specific medication when expressed by the cumulative dose or by the cumulative dose over the past 1–5 years. We will investigate exposure to multiple medications, first by models that postulate independent multiplicative effects on cancer incidence rates (no interaction model); if there are sufficient data, we will proceed to examine interactions between medications on cancer incidence. Due to data limitations, it is unlikely that anything more than pairwise interactions can be examined satisfactorily. The analyses of glucose-lowering medications will take into account the clinical factors (HbA1c, fasting glucose) that may have triggered the change in treatment. We will, for example, use the methods of Walker, White, and Babiker to better evaluate how HbA1c triggers treatment changes. This may or may not be important for cancer incidence analyses, depending on whether HbA1c and fasting plasma glucose are related to cancer incidence (see Specific Aim 2). In analyses of Specific Aims 2 and 3, we will also be cautious of the possibility of reverse causation, whereby early physiological changes before the diagnosis of cancer may cause deterioration in glucose control and a subsequent change in diabetes treatment. We will therefore investigate whether early peaks in cancer incidence rates follow a change in treatment, and will examine model results when all cancer diagnoses are moved backwards in time by a set period of 2 years, 1 year, or 6 months.
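The reverse-causation check of moving diagnoses backwards in time can be sketched as follows. The helper names are hypothetical, and the lag lengths in days are approximations of the 6-month, 1-year, and 2-year periods named in the text.

```python
from datetime import date, timedelta

# Approximate lag lengths in days (assumed conversions).
LAG_DAYS = {"6 months": 183, "1 year": 365, "2 years": 730}


def shifted_diagnosis(diagnosis_date: date, lag_days: int) -> date:
    """Move a cancer diagnosis date backwards by the chosen lag."""
    return diagnosis_date - timedelta(days=lag_days)


def follows_treatment_change(treatment_change: date, diagnosis_date: date,
                             lag_days: int) -> bool:
    """True if the lagged diagnosis still falls after the treatment change."""
    return shifted_diagnosis(diagnosis_date, lag_days) > treatment_change
```

A cancer diagnosed eight months after a treatment change is still counted as following that change with a 6-month lag but not with a 1-year lag. If an association disappears as the lag grows, that pattern is consistent with reverse causation.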
As outlined above for specific analyses, multiple independent sensitivity analyses will be used more generally to test our model assumptions and the robustness of our results. We will look at varying time intervals for the piecewise exponential models, stratification for departures from proportional hazards, interactions, and time-dependent models including lagged predictors. Propensity scores will be used as an alternative to direct confounder adjustment, to control for confounding by indication, and to check the robustness of conclusions. We will also explore results obtained by applying the recently described “prior event rate ratio” method of Tannen, Weiner, and Xie to the analysis of glucose-lowering medications. Though this method will only be applicable to incident cases of diabetes, its effectiveness in controlling for confounding by indication has been reported.
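The idea behind the prior event rate ratio is to divide the exposed-versus-unexposed ratio observed after treatment starts by the same ratio observed before treatment starts, so that stable differences between the groups cancel. The sketch below uses crude incidence rate ratios for illustration; this is a simplification assumed here, and the function names and input layout are hypothetical.

```python
from typing import Tuple

EventsAndPersonYears = Tuple[int, float]  # (number of events, person-years)


def incidence_rate(group: EventsAndPersonYears) -> float:
    events, person_years = group
    return events / person_years


def prior_event_rate_ratio(
    post_exposed: EventsAndPersonYears,
    post_unexposed: EventsAndPersonYears,
    prior_exposed: EventsAndPersonYears,
    prior_unexposed: EventsAndPersonYears,
) -> float:
    """Rate ratio after treatment start divided by the ratio before it."""
    ratio_post = incidence_rate(post_exposed) / incidence_rate(post_unexposed)
    ratio_prior = incidence_rate(prior_exposed) / incidence_rate(prior_unexposed)
    return ratio_post / ratio_prior
```

With an after-treatment ratio of 1.5 and a prior-period ratio of 1.2, the adjusted ratio is 1.25. The adjustment depends on the prior-period difference reflecting only confounding, which is an assumption of the method.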
Cancer prevalent cases will be analyzed separately and will have separate adjustments performed, including separate propensity scoring. This is to overcome potential confounding factors associated with cancer treatments and their residual effects. Furthermore, the risk of second primary neoplasms has not been analyzed in diabetes patients, and might have a different pattern of associations with the disease itself and with the use of glucose-lowering therapies than does the risk of first primaries.
The primary analysis will be performed using PROC PHREG of SAS version 9.2, which allows both Cox proportional hazards models with time-dependent variables and piecewise exponential models. The piecewise exponential model divides time-at-risk into a set of pre-specified intervals with a constant baseline hazard in each interval; the use of multiple intervals allows the hazard to vary with time. Clinically important time-dependent covariates for the regression models include age, cholesterol (total, LDL, and HDL), concomitant medications (such as statins, beta-blockers, angiotensin-converting enzyme inhibitors, and hormone replacement therapy), smoking, body mass index, blood pressure, co-morbidities (such as cardiovascular disease), and reproductive history. For each continuous variable, we will investigate appropriate transformations and interactions that might improve goodness-of-fit. The proportional hazards assumption will be evaluated using log-log survival plots, examination of Schoenfeld residuals, and testing of interactions between variables and time.
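To make the piecewise exponential idea concrete, the sketch below splits each person's follow-up at pre-specified cut points and estimates a constant hazard per interval as events divided by person-time. This is the covariate-free case only, written in Python for illustration rather than in SAS; the function names and the choice of cut points are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def split_follow_up(time: float, event: int,
                    cut_points: Sequence[float]) -> List[Tuple[int, float, int]]:
    """Split one follow-up time into (interval index, time at risk, event) pieces."""
    pieces: List[Tuple[int, float, int]] = []
    lower = 0.0
    for index, upper in enumerate(list(cut_points) + [float("inf")]):
        if time <= lower:
            break
        at_risk = min(time, upper) - lower
        # The event belongs to the interval in which follow-up ends.
        event_here = int(bool(event) and time <= upper)
        pieces.append((index, at_risk, event_here))
        lower = upper
    return pieces


def piecewise_hazards(subjects: Iterable[Tuple[float, int]],
                      cut_points: Sequence[float]) -> List[float]:
    """Maximum likelihood hazard per interval: events / person-time at risk."""
    n_intervals = len(cut_points) + 1
    events = [0] * n_intervals
    exposure = [0.0] * n_intervals
    for time, event in subjects:
        for index, at_risk, event_here in split_follow_up(time, event, cut_points):
            exposure[index] += at_risk
            events[index] += event_here
    return [e / t if t > 0 else float("nan") for e, t in zip(events, exposure)]
```

With covariates, the same split data are typically analyzed with a log-linear model for the interval-specific rates, and varying the cut points is one way to check how sensitive the results are to the chosen intervals.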
While we cannot describe, or even anticipate, every contingency for this comprehensive project, we are confident that the very well qualified team has the expertise to evaluate the data.
This study was approved by the ethics committees of Sheba Medical Center, Tel Hashomer and Clalit Health Services, Israel.