The SEER–Medicare data linkage, a collaboration of the National Cancer Institute and the Centers for Medicare and Medicaid Services, was created by linking two large population-based data sources: the SEER cancer registry system and the Medicare enrolment and claims files. A detailed description of the database is available elsewhere (Warren et al, 2002). Briefly, it includes all incident cancer cases recorded by SEER registries, currently encompassing ~25% of the US population, plus a 5% random sample of all Medicare beneficiaries residing in SEER areas to serve as population-based controls. The Medicare part of the linkage includes data from inpatient claims since 1986 and from all other types of claims (outpatient, physician, home health, and hospice services) since 1991. The database does not include claims for Medicare beneficiaries during enrolment in a health maintenance organisation (HMO). Clinical diagnoses are coded using the International Classification of Diseases, 9th Revision (ICD-9) codes (US Public Health Service, 1996).
From the SEER database, we included all women with incident primary invasive adenocarcinoma of the breast (ICD-O-3 C500-C509; histology codes 8140, 8201, 8211, 8480, 8500, 8501, 8503, 8504, 8520, 8521, 8522, 8523, 8524, 8530, 8541, and 8543; and behaviour code 3) who were diagnosed in 1993–2002, with no previous cancer of any type; BC cases diagnosed only at autopsy or by death certificate were excluded.
Female controls who were alive and cancer free as of 1 July of the calendar year of case selection were selected at random, with replacement, from the 5% sample of the Medicare beneficiaries who resided in SEER areas. To ensure availability of claims data, cases and controls had to be 67–99 years of age, selected in 1993 or later, and have at least 12 months of simultaneous Part A and Part B coverage (and no HMO coverage) before the selection date. Controls were frequency matched in a 1:1 ratio to cases according to the calendar year of diagnosis and age in three categories (67–74, 75–84, and 85+). Women who later became BC cases were eligible for selection as controls until their cancer diagnosis.
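The frequency matching described above can be illustrated with a minimal sketch. It assumes one control is drawn per case within each stratum of calendar year and age category, with replacement; the function names, the input layout, and the toy identifiers are hypothetical and are not taken from the study's actual programs.

```python
import random
from collections import Counter


def age_group(age):
    """Map age in years to the three matching categories named in the text."""
    if 67 <= age <= 74:
        return "67-74"
    if 75 <= age <= 84:
        return "75-84"
    if 85 <= age <= 99:
        return "85+"
    raise ValueError(f"age {age} is outside the 67-99 eligibility range")


def frequency_match(cases, control_pool, seed=0):
    """Draw one control per case from the same (year, age group) stratum.

    cases: list of (case_id, year, age) tuples (hypothetical layout).
    control_pool: dict mapping (year, age group) to a list of eligible control ids.
    Sampling is with replacement, so the same control id may be drawn repeatedly.
    """
    rng = random.Random(seed)
    # Count how many cases fall in each matching stratum.
    strata = Counter((year, age_group(age)) for _, year, age in cases)
    selected = []
    for stratum, n_cases in sorted(strata.items()):
        pool = control_pool[stratum]
        selected.extend((stratum, rng.choice(pool)) for _ in range(n_cases))
    return selected


# Toy data, for illustration only.
cases = [("c1", 1993, 70), ("c2", 1993, 72), ("c3", 1994, 88)]
pool = {(1993, "67-74"): ["p1", "p2", "p3"], (1994, "85+"): ["p4"]}
controls = frequency_match(cases, pool)
```

Frequency matching balances the stratum distribution of controls against that of cases rather than pairing individuals, which is why the sketch only tracks stratum counts.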
Participants were considered to have SARDs if they had at least one inpatient or two outpatient/physician claims (with a minimum interval of 30 days between claims) for any of the following diagnoses: RA (ICD-9 714.0, 714.1, 714.2, 714.3, 714.81, or V82.1), SLE (710.0), systemic sclerosis (710.1), Sjogren's syndrome (710.2), or dermatomyositis (710.3). Women who met the definition for more than one condition were included in a separate category, multiple SARDs. History of SARDs was ascertained from Medicare claim files, up to 12 months before case–control selection date.
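The claims-based definition above can be sketched as follows. The sketch makes two assumptions that the text does not settle: the 30-day rule is read as "the earliest and latest qualifying outpatient/physician claims are at least 30 days apart", and only claims dated on or before a supplied cutoff (12 months before selection) are counted. The claim layout and function names are hypothetical.

```python
from datetime import date

# ICD-9 codes listed in the text, grouped by condition.
SARD_CODES = {
    "RA": {"714.0", "714.1", "714.2", "714.3", "714.81", "V82.1"},
    "SLE": {"710.0"},
    "systemic sclerosis": {"710.1"},
    "Sjogren's syndrome": {"710.2"},
    "dermatomyositis": {"710.3"},
}


def meets_definition(claims, codes, min_gap_days=30):
    """claims: list of (service_date, setting, icd9); setting is 'inpatient' or 'outpatient'."""
    relevant = [(d, s) for d, s, code in claims if code in codes]
    # One inpatient claim is sufficient on its own.
    if any(s == "inpatient" for _, s in relevant):
        return True
    # Otherwise require two outpatient/physician claims at least min_gap_days apart
    # (assumed reading: earliest versus latest qualifying claim).
    dates = sorted(d for d, _ in relevant)
    return len(dates) >= 2 and (dates[-1] - dates[0]).days >= min_gap_days


def classify_sard(claims, cutoff):
    """Return the condition name, 'multiple SARDs', or None if no definition is met."""
    eligible = [c for c in claims if c[0] <= cutoff]  # assumed inclusive cutoff
    hits = [name for name, codes in SARD_CODES.items() if meets_definition(eligible, codes)]
    if not hits:
        return None
    return hits[0] if len(hits) == 1 else "multiple SARDs"


# Toy claims history, for illustration only.
claims = [
    (date(1994, 1, 10), "outpatient", "714.0"),
    (date(1994, 3, 1), "outpatient", "714.0"),
    (date(1994, 5, 5), "inpatient", "710.0"),
]
```

With this toy history, a cutoff of 1 April 1994 sees only the two RA claims, whereas a cutoff of 1 June 1994 also sees the inpatient SLE claim and places the woman in the multiple-SARDs category.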
Variables assessed as potential confounders of associations between BC and SARDs included age, race, socioeconomic status, region of residence, history of mammography, number of physician visits 12–24 months before selection, and earlier use of immunosuppressive medications. We used the 1990 census median annual household income in the study participants' zip code of residence as a proxy measure of individuals' socioeconomic status. The regions of residence, based on the location of the SEER registry, were categorised as western, northeastern, north-central, and southern. History of mammography was defined as any mammography claim recorded from 12 months to a maximum of 48 months before case–control selection. History of immunosuppressive therapy was defined as any Medicare claim for the following between age 65 years and 12 months before the case–control selection: cyclophosphamide, methotrexate, chlorambucil, azathioprine, cyclosporine, mycophenolate mofetil, sirolimus, tacrolimus, prednisone, prednisolone, methylprednisolone, and immunosuppressive medication not otherwise specified.
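The look-back window for history of mammography (12 to 48 months before selection) can be sketched as below. The sketch assumes calendar-month arithmetic with inclusive window bounds, neither of which is specified in the text, and the function names are hypothetical.

```python
import calendar
from datetime import date


def months_before(d, months):
    """Return the date `months` calendar months before d, clamping the day to month end."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def has_mammography_history(claim_dates, selection_date):
    """True if any mammography claim falls 12 to 48 months before selection (assumed inclusive)."""
    window_start = months_before(selection_date, 48)
    window_end = months_before(selection_date, 12)
    return any(window_start <= d <= window_end for d in claim_dates)


# Toy example: a claim six months before selection lies outside the window.
recent_only = has_mammography_history([date(1996, 1, 15)], date(1996, 7, 1))
```

Excluding the 12 months immediately before selection keeps mammograms performed during the diagnostic work-up of the cancer from being counted as screening history.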
Characteristics of breast cancer cases (overall and by ER status) and controls
We used unconditional logistic regression to calculate odds ratios (ORs) and 95% confidence intervals (CIs) for the association between BC risk and SARDs (overall and for each condition). We computed the variance of the OR estimates using a robust variance estimator (Zeger and Liang, 1986) to adjust for the correlations between observations when the same participant was selected in different calendar years.
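As a minimal illustration of how an OR and its 95% CI follow from a fitted logistic regression coefficient, the sketch below assumes the coefficient and its robust standard error have already been estimated elsewhere. It does not implement the robust variance estimator itself, and the function name and input values are hypothetical.

```python
import math


def odds_ratio_ci(beta, robust_se, z=1.959964):
    """Convert a log-odds coefficient and its standard error to an OR with a Wald-type CI.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * robust_se)
    upper = math.exp(beta + z * robust_se)
    return odds_ratio, lower, upper


# Hypothetical inputs: a null coefficient with a standard error of 0.1.
or_null, lower_null, upper_null = odds_ratio_ci(0.0, 0.1)
```

Because the interval is built on the log-odds scale and then exponentiated, it is symmetric around the OR on a multiplicative rather than an additive scale.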
In evaluating BC risk by ER status, we excluded cases with unknown ER expression. We used polytomous logistic regression to estimate ORs and 95% CIs with a robust variance estimator comparing ER-positive cases and ER-negative cases with the cancer-free controls (Anderson et al, 2008). A Wald test with one degree of freedom was used to test for the heterogeneity in the estimated regression coefficient for specific SARDs between ER-positive and ER-negative cases.
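A one-degree-of-freedom Wald test for heterogeneity between two coefficients can be sketched as follows. It assumes the two log-odds coefficients, their variances, and their covariance are taken from the (robust) covariance matrix of the fitted polytomous model; the function name and inputs are hypothetical.

```python
import math


def wald_heterogeneity(beta_pos, beta_neg, var_pos, var_neg, cov=0.0):
    """Wald chi-square (1 df) and P-value for H0: beta_pos == beta_neg."""
    diff = beta_pos - beta_neg
    # Variance of a difference of two correlated estimates.
    var_diff = var_pos + var_neg - 2.0 * cov
    chi2 = diff * diff / var_diff
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical inputs: identical coefficients give no evidence of heterogeneity.
chi2_same, p_same = wald_heterogeneity(0.2, 0.2, 0.04, 0.09)
```

The covariance term matters because both coefficients come from the same model and share the same control group, so their estimates are generally not independent.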
All final models were adjusted for age and year of selection (matching variables), race, region of residence, median household income by zip code of residence, history of mammography, and history of immunosuppressive therapy as defined earlier. We also stratified all the models by year of selection in three groups (1993–95, 1996–99, and 2000–02) to investigate whether the estimated associations were affected by secular trends in coding accuracy, mammography screening, or standard of care. In the final models that evaluated overall BC risk associated with SARDs (overall and by condition), we added two interaction terms, age (≥75 vs <75 years) and race (whites vs others), to test whether these associations were modified by age or race.