To estimate the average survival effects of breast conserving surgery plus irradiation relative to mastectomy for marginal stage II breast cancer patients in Iowa from 1989–1994.
Secondary linked Iowa SEER Cancer Registry—Iowa Hospital Association discharge abstract data for women in Iowa with stage II breast cancer from 1989–1994.
Observational instrumental variables (IV) analysis.
Women with stage II breast cancer from the Iowa SEER Cancer Registry 1989–1994 who received all of their inpatient care in Iowa were linked with their respective hospital discharge abstracts.
Breast conserving surgery plus irradiation decreased survival relative to mastectomy for marginal stage II breast cancer patients in Iowa during the early 1990s. In this study marginal patients were those whose surgery choices were affected by differences in area treatment rates and access to radiation facilities.
If marginal patients are representative of patients whose treatment choices would be affected by changes in treatment rates, an increase in the breast conserving surgery plus irradiation rate for stage II early stage breast cancer patients would have decreased survival in Iowa during the early 1990s. Further research with newer data and broader samples is needed to make more current and specific assessments.
The initial treatment decision for patients with early stage breast cancer (stages I, IIa, and IIb) is the surgical approach for local tumor control—mastectomy (MAS) or breast conserving surgery plus irradiation (BCSI). In its 1991 Consensus Statement on this decision, the National Institutes of Health (NIH) recommended BCSI for most women with early stage breast cancer (ESBC) based on randomized controlled trial (RCT) evidence showing equivalent survival benefits of BCSI and MAS and the cosmetic superiority of BCSI (National Institutes of Health Consensus Conference 1991). The NIH statement suggested that patients should be educated on the choices and they should make decisions consistent with their preferences. The BCSI rates increased in the United States afterward, but many ESBC patients still do not receive BCSI, and BCSI rates vary well beyond what would be expected based solely on differences in patient preferences (Benedict et al. 2001; Du et al. 2000; Guadagnoli et al. 1998; Keleman et al. 2001; Morrow et al. 2001; Riley et al. 1999).
The slow and varied rate of BCSI diffusion in the United States puzzled researchers and policymakers (Morrow et al. 2001; Adams 2001, Keating et al. 2001). This phenomenon has been attributed to a lack of provider knowledge of the RCT evidence. Interventions to increase the BCSI rate have been suggested (Keleman et al. 2001; Benedict et al. 2001; Nold et al. 2000; Stafford et al. 1998). An alternative explanation for the slow and varied BCSI diffusion rate, though, necessitates further research prior to initiating efforts to increase BCSI rates. Providers may have been aware of the NIH Consensus Statement and its supporting RCT evidence and they may have had beliefs consistent with the NIH for patients clinically similar to those in the RCTs. However, providers may have differed with the NIH in whether the RCT evidence could be extrapolated to ESBC patients that were clinically different from the average patients in the RCTs. Providers may have believed that MAS offered survival benefits for many ESBC patients and BCSI rate variation may have resulted from regional differences in these beliefs. If providers sorted patients between MAS and BCSI appropriately, an increase in the BCSI rate would have worsened survival for patients whose surgery choices were affected (i.e., ESBC patients who would have otherwise received MAS). Therefore, before initiating efforts to increase BCSI rates, policymakers need estimates of the relative survival benefits of BCSI and MAS for the set of patients who would switch from MAS to BCSI as a result.
In this study we assume that the set of patients whose surgery choice would be more likely affected by a BCSI rate increase would be those patients for whom the RCT evidence supporting the survival equivalence between BCSI and MAS is the least certain. The RCT evidence supports the survival equivalence of BCSI and MAS for the “average” ESBC patient in each RCT (Veronesi et al. 1990; Arriagada et al. 1996; Jacobson et al. 1995; Fisher et al. 1995; van Dongen et al. 2000). However, if the survival benefits of BCSI and MAS are heterogeneous across ESBC patients, it is not clear whether this evidence can be generalized to patients with clinical circumstances differing from the average patients in the RCTs. Patients with ESBC are classified with stage I disease if they have localized tumors less than 2 cm with no lymph node involvement; stage IIa if they have either a localized tumor less than 2 cm with positive lymph node metastasis on the same side, or a tumor between 2 and 5 cm with no lymph node involvement; and stage IIb if they have either a localized tumor between 2 and 5 cm with positive lymph node metastasis on the same side, or a tumor greater than 5 cm with no lymph node involvement. Two RCTs included only stage I patients and provided compelling evidence that MAS offers no survival benefit for these patients (Veronesi et al. 1990; Arriagada et al. 1996). The evidence for stage II patients is less certain. The remaining studies contained a mix of stage I and stage II patients and each estimated a single average treatment effect (Jacobson et al. 1995; Fisher et al. 1995; van Dongen et al. 2000). Because no studies contained only stage II patients, it is less certain whether these estimates can be generalized to stage II patients. In fact, in the study with the most stage II patients (mainly stage IIa) tumor size and nodal involvement were related to an increased risk of local recurrence for BCSI patients but not for MAS patients (van Dongen et al. 1992).
As a result of this uncertainty, we focused this study on ESBC patients with stage II disease. We apply instrumental variable (IV) methods to obtain estimates of BCSI survival effects relative to MAS for stage II ESBC patients in Iowa from 1989–1994. Instrumental variable methods group patients using measured instrumental variables (instruments) that have the following two properties: (1) they are related to treatment choice and (2) they are assumed related to outcomes only through their effect on treatment choice (no direct effect on outcome and no indirect effect on outcome through unmeasured confounders). Instrumental variable estimates are obtained by exploiting the treatment variation across patient groups defined by the instruments. The inferences made from IV estimates are conditional on the assumption that instruments essentially ex post randomize unmeasured confounders across patient groups. Given this assumption, IV methods yield consistent estimates of treatment effects for marginal patients, defined as the subset of patients whose treatment choices varied with the instrument (Angrist, Imbens, and Rubin 1996; Imbens and Angrist 1994; McClellan, McNeil, and Newhouse 1994; Harris and Remler 1998; Brooks, McClellan, and Wong 2000; McClellan and Newhouse 2000). In addition, previous IV research assumed that IV estimates can be generalized to the set of patients affected by treatment rate changes, as marginal patients are also theorized to come from the set of patients for whom the best treatment is least certain (McClellan, McNeil, and Newhouse 1994; Harris and Remler 1998; Brooks et al. 2000a). However, marginal patients associated with a single instrument may not be representative of all patients potentially affected by a treatment rate change. We assess the validity of this assumption by using different instrument specifications.
The nature of IV estimation limits our ability to generalize the findings beyond the set of marginal patients defined by our instruments, but this empirical scenario provides an opportunity to demonstrate the applicability of IV methods to policy-based research questions.
Previous research attributed variation in ESBC surgical choice across geographic areas to regional differences in “surgical philosophy” or “surgeon propensity” (Guadagnoli et al. 1998; Sainsbury et al. 1995; Foster, Farwell, and Costanza 1995; Answini et al. 2001; Iscoe et al. 1994; Mandelblatt et al. 2001). We theorize that there are regionally distinct surgical philosophies that lead to regional differences in BCSI rates. Empirically, we measure surgical philosophy in the area surrounding each stage II patient as the BCSI percentage of ESBC surgeries for all other ESBC patients (stage I and II) in a 50-mile radius around each patient's residence in their diagnosis year. Our IV approach groups ESBC stage II patients based on these rates and exploits the BCSI rate differences for stage II patients across these groups to estimate the average survival benefits of BCSI relative to MAS for marginal stage II patients.
Our theory relies on the notion that differences in patient access to providers with different surgical philosophies at the time of diagnosis lead to different surgery choices for marginal ESBC patients. We theorize that patients initially seek care from providers closer to their residences and that the surgical philosophies of these providers weigh in the surgical decisions made by ESBC patients (even if surgery is eventually performed outside the patient's local area). With respect to unmeasured confounders, there is little theoretical basis linking patient residence decisions made prior to an ESBC diagnosis with unmeasured patient severity found after diagnosis. Without more data collection it is impossible to validate directly whether instruments are unrelated to unmeasured confounders, and spurious correlations between instruments and unmeasured confounders may remain that bias our estimates. For example, area BCSI rates may be correlated with patient socioeconomic status or the general access of patients to health care. We mitigated this risk to our estimates by specifying area poverty percentages and the distance to the nearest hospital in our IV analysis.
In addition, the marginal patients defined by a single instrument may not fully describe the set of patients affected by an increase in BCSI rates. Alternative instruments may affect the surgical choices of distinct subsets of patients. Accordingly, if surgical effects are heterogeneous, variation in IV estimates across instruments suggests that the set of marginal patients varies with the instrument and that models specifying more than one instrument may yield estimates more representative of the average stage II ESBC patient affected by an increase in BCSI rates. To evaluate this possibility, we also grouped patients by the distance from their residence to the nearest radiation treatment center based on the theory that travel cost affects surgery choice. Patients with residences further from radiation treatment centers at the time of diagnosis have been less likely to receive BCSI (Hadley and Mitchell 1997). We specified models with each instrument separately and with both instruments.
Our data came from the Iowa Surveillance, Epidemiology, and End Results (SEER) Program Cancer Registry, the Iowa Hospital Association (IHA) inpatient discharge abstract files, and the Census Bureau's 1990 Zip Code Summary Tape File 3B. The Iowa SEER Registry provided the universe of first primary ESBC patients (stages I, IIa, and IIb) diagnosed during 1989–1994 who had either MAS or BCSI indicated within their first course of treatment (n=8,143). To collect comorbidity and payer data we excluded patients who received inpatient care outside of Iowa (n=848) and linked the remaining patients (n=7,296) to their respective inpatient discharge abstracts from the Iowa Hospital Association database. A previous paper provides a summary of the linkage approach and linkage validation statistics (Brooks et al. 2000). Inpatient discharge abstracts were linked to 84 percent of stage I patients and 89 percent of stage II patients. Of the linked patients, 2,905 were either stage IIa or IIb.
For each of the 2,905 stage II patients, Iowa SEER data were used to create binary variables defining treatment choice (MAS or BCSI), survival (alive 1 year, 2 years, 3 years, 4 years after diagnosis), cancer stage (T, N, M), cancer grade, tumor location, and age (younger than 50, 50–64, 65–69, 70–74, 75–79, 80–84, older than 84). We used the discharge abstracts to specify binary variables for payer (Medicare, Medicaid, Blue Cross/Blue Shield, other private, other government, no third party), and a Charlson comorbidity index based on the diagnoses within their initial inpatient discharge after diagnosis (Charlson et al. 1987). We created binary variables for the poverty percentage in the patient's zip code using data from the 1990 zip code summary file (≤7, 7–10, 10–13, 13–20, >20), and for the distance in miles from residence to nearest hospital (≤2.83, 2.83–9, 9–15, >15) in the patient's year of diagnosis using hospital zip code data from the Iowa Hospital Association. Distances between zip code centroids were used to calculate the BCSI percentage of ESBC surgeries for all other ESBC patients (stage I and II) in a 50-mile radius around each patient's residence in their diagnosis year. The distance from each patient to the nearest radiation treatment center was calculated as the miles from the centroid of the patient's residence zip code to the centroid of the zip code containing the nearest radiation treatment center in the patient's diagnosis year, using treatment center data obtained from Iowa SEER.
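The area BCSI rate described above is a leave-one-out rate: each patient's own surgery is excluded from the rate assigned to her. The following is a minimal sketch of that calculation, not the authors' code. The record layout, the coordinates, and the use of the haversine formula for centroid-to-centroid distance are all illustrative assumptions.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def area_bcsi_rate(patients, index, radius_miles=50.0):
    """BCSI share among all OTHER patients diagnosed in the same year whose
    zip centroid lies within radius_miles of patient `index`.
    Returns None when no other patient qualifies."""
    me = patients[index]
    n = bcsi = 0
    for j, other in enumerate(patients):
        if j == index or other["year"] != me["year"]:
            continue  # leave the patient herself out; match on diagnosis year
        d = haversine_miles(me["lat"], me["lon"], other["lat"], other["lon"])
        if d <= radius_miles:
            n += 1
            bcsi += other["bcsi"]
    return bcsi / n if n else None

# Hypothetical records: zip-centroid coordinates, diagnosis year, surgery (1=BCSI, 0=MAS).
patients = [
    {"lat": 41.60, "lon": -93.60, "year": 1990, "bcsi": 0},
    {"lat": 41.70, "lon": -93.50, "year": 1990, "bcsi": 1},
    {"lat": 41.65, "lon": -93.70, "year": 1990, "bcsi": 0},
    {"lat": 42.50, "lon": -96.40, "year": 1990, "bcsi": 1},  # more than 50 miles away
    {"lat": 41.60, "lon": -93.60, "year": 1991, "bcsi": 1},  # different diagnosis year
]
rate0 = area_bcsi_rate(patients, 0)
print(rate0)
```

Excluding the patient's own surgery keeps her treatment choice from mechanically entering the instrument assigned to her.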
We employed a nonparametric two-stage least squares (2SLS) variant of IV estimation that uses a minimum of distributional assumptions. This approach was used in previous IV research in healthcare (McClellan, McNeil, and Newhouse 1994; McClellan and Newhouse 1997; Brooks, McClellan, and Wong 2000) and in questions of labor supply (Angrist and Evans 1998, Angrist 2001). Consistent estimates are yielded by 2SLS regardless of the underlying error distributions, whereas alternative estimators that rely on error term distributional assumptions are inconsistent if the assumptions are wrong (Angrist 2001). In the first stage of the 2SLS approach we used ordinary least squares (OLS) to estimate the following model of surgery choice:
$$R_i = a_0 + a_1 X_i + a_2 A_i + c_i + e_i \qquad (1)$$

where Ri is the surgical choice for patient i (1=BCSI, 0=MAS), Xi is a vector of binary variables containing measured confounding variables (diagnosis year, age groups, tumor size, nodal status, tumor grade, tumor location, comorbidity index, insurance status, distance to nearest hospital, zip code poverty level), ci is the effect of unmeasured confounders that affect both surgery choice and patient survival, and ei is the net impact of the set of unmeasured factors that affect surgery choice only. Ai is a vector of binary variables that group patients based on each patient's instrument value. The distribution of each instrument was assessed across the sample, and cutoff values for each instrument were determined to divide patients into groups of similar size. A Chow F-test (Chow 1960) of whether Ai describes a significant portion of the variation in Ri (i.e., whether the estimates of a2 are simultaneously equal to zero) provides a natural test of whether the instruments affect treatment choice.
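To illustrate the joint test on the instrument-group dummies, the sketch below computes the F statistic for a deliberately simplified first stage that contains only an intercept and the group dummies, with no measured confounders Xi. In that special case the joint F-test reduces to the one-way analysis-of-variance F. The data and the function name `first_stage_f` are hypothetical, and the study's actual first stage also conditions on Xi.

```python
def first_stage_f(treatment, group):
    """F statistic for H0: all instrument-group dummies have zero effect on
    treatment, in a model with an intercept and no other covariates."""
    n = len(treatment)
    groups = sorted(set(group))
    k = len(groups)
    grand = sum(treatment) / n
    ss_between = ss_within = 0.0
    for g in groups:
        vals = [t for t, gi in zip(treatment, group) if gi == g]
        mean_g = sum(vals) / len(vals)
        ss_between += len(vals) * (mean_g - grand) ** 2   # explained by groups
        ss_within += sum((v - mean_g) ** 2 for v in vals)  # residual
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data: 1 = BCSI, 0 = MAS; group 0 = low-rate area, group 1 = high-rate area.
treatment = [0] * 9 + [1] * 1 + [0] * 7 + [1] * 3
group = [0] * 10 + [1] * 10
f_stat = first_stage_f(treatment, group)
print(round(f_stat, 3))
```

A larger F statistic indicates that the grouping explains more of the variation in surgery choice. With only 20 hypothetical patients, the value here is small.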
In the second stage of 2SLS, we estimated survival models using four different survival measures (1 year, 2 years, 3 years, 4 years). Each survival model was specified as follows:
$$S_i = b_0 + b_1 X_i + b_2 R_i + c_i + u_i \qquad (2)$$

where Si is a binary variable equal to 1 if patient i survives beyond a certain time interval past their diagnosis (1 year, 2 years, 3 years, 4 years), Xi, Ri, and ci are defined as in equation (1), and ui is the set of unmeasured factors that affect patient survival and not surgery choice. The average survival effect of BCSI relative to MAS is represented by b2. Estimating equation (2) using OLS or another analysis-of-variance method (ANOVA) will yield a biased estimate of b2 if ci is not equal to zero.
When estimating equation (2), 2SLS avoids this bias by replacing the actual surgery variable Ri in equation (2) with R̂i, the predicted BCSI surgery probability from equation (1) for each patient. Bias is avoided because Xi and Ai are the only sources of variation in R̂i from equation (1); because Xi is also specified in equation (2), the only variation in R̂i that is used to estimate b2 in equation (2) is the variation attributable to Ai. Because Ai is assumed to be unrelated to ci, the IV estimate of b2 provides a consistent estimate of the change in the survival rate from a one-unit change in the BCSI rate. This estimate is generalizable only to the group of patients whose treatment choices were affected by the instruments (the marginal patients).
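The two-stage logic can be sketched on a small constructed dataset. Everything below is hypothetical: there are no measured confounders, the outcome is a continuous score rather than a binary survival indicator, and the true effect (-0.2) is chosen arbitrarily. The unmeasured confounder is balanced across instrument groups by construction, which is exactly the assumption that IV estimation relies on.

```python
def slope(x, y):
    """OLS slope of y on x in a simple regression with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_stage(treatment, outcome, group):
    """2SLS point estimate with instrument-group dummies and no covariates."""
    # Stage 1: OLS on group dummies predicts each group's treatment rate.
    rate = {}
    for g in set(group):
        vals = [t for t, gi in zip(treatment, group) if gi == g]
        rate[g] = sum(vals) / len(vals)
    predicted = [rate[g] for g in group]
    # Stage 2: regress the outcome on predicted (not actual) treatment.
    return slope(predicted, outcome)

TRUE_EFFECT = -0.2  # hypothetical effect of treatment on the outcome
rows = []  # (instrument group, unmeasured confounder c, treatment R)
for g, n_treated in enumerate([1, 2, 3]):
    for _ in range(5):
        rows.append((g, 0, 0))                          # c=0 patients: never treated
    for j in range(5):
        rows.append((g, 1, 1 if j < n_treated else 0))  # c=1 patients: some treated
group = [r[0] for r in rows]
treatment = [r[2] for r in rows]
outcome = [0.9 + TRUE_EFFECT * r[2] + 0.1 * r[1] for r in rows]

ols_estimate = slope(treatment, outcome)             # confounded by c
iv_estimate = two_stage(treatment, outcome, group)   # uses only group-level variation
print(round(ols_estimate, 4), round(iv_estimate, 4))
```

In this constructed example only lower-risk patients (c=1) are treated, so OLS understates the harm (-0.1375), while the two-stage estimate recovers -0.2. The sketch produces point estimates only: standard errors from a manually run second stage are not valid and need the usual 2SLS correction.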
If Ai is specified using a single dummy variable, the sample is divided into two groups and b2 is estimated using the surgery rate differences between the two groups (e.g., Earle et al. 2001). If Ai is specified with several dummy variables that divide the sample into several groups, the empirical model is overidentified and b2 is estimated as the weighted average of the many two-group estimates that are available. When an empirical model is overidentified, a Hausman statistic (Hausman 1983) for overidentifying restrictions can be used to test the null hypothesis that the exclusion of Ai from the outcome equation was appropriate (i.e., Ai affects Si only through Ri). A large value of the Hausman statistic rejects the null hypothesis.
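For the two-group case with no measured confounders, the IV estimate takes the familiar ratio (Wald) form. The notation below is introduced here for illustration only: the barred terms are group means of survival and of BCSI receipt, and the subscripts 1 and 0 index the two instrument groups.

```latex
\hat{b}_2^{\,IV} \;=\; \frac{\bar{S}_1 - \bar{S}_0}{\bar{R}_1 - \bar{R}_0}
```

The numerator is the difference in survival rates between the two groups and the denominator is the difference in BCSI rates. When the BCSI rates of the two groups are close, the denominator is small and the estimate becomes imprecise.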
Little theoretical guidance exists to specify the number of binary variables (Ai) for each instrument. Adding groups increases the number of two-group comparisons used in estimation, but lowers the number of patients in each group and increases the risk of introducing spurious relationships that violate the IV assumptions (e.g., membership in a particular group may be perfectly correlated with an unmeasured confounder). We assessed the robustness of our findings by varying the number of patient groups for each instrument (2, 4, 8, and 12 groups).
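One simple way to form groups of similar size from a continuous instrument is to rank patients and cut the ranks into equal blocks. The helper below is a hypothetical sketch of that idea, not the authors' cutoff procedure. Note that ranking can place tied instrument values in different groups, which value-based cutoffs would not do.

```python
def equal_size_groups(values, n_groups):
    """Assign each value a label 0..n_groups-1 by rank so that groups are
    as close to equal size as possible."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_groups // len(values)
    return labels

# Hypothetical area BCSI rates for eight patients.
area_rates = [0.05, 0.12, 0.08, 0.10, 0.07, 0.15, 0.09, 0.11]
two_groups = equal_size_groups(area_rates, 2)
four_groups = equal_size_groups(area_rates, 4)
print(two_groups)
print(four_groups)
```

Each label would then be expanded into a binary indicator to form the vector Ai, omitting one group as the reference category.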
Table 1 provides univariate comparisons of BCSI rates and measured confounders by grouping method to assess the properties of our instruments. A comparison of the third (MAS) and fourth (BCSI) columns shows that ESBC stage II patients receiving BCSI had characteristics related to lower survival risk regardless of surgery choice (smaller tumors, younger, fewer comorbidities). Columns 4 and 5 compare patients grouped by whether the BCSI rates in the area around their residences were lower or greater than the median, respectively. Columns 6 and 7 compare patients by whether they lived farther from or nearer to a radiation treatment center, respectively. Stage II patients were less likely to receive BCSI if they lived in an area with lower BCSI rates across all ESBC patients (7.3 percent versus 12.1 percent) or were farther from a radiation treatment facility (6.5 percent versus 13.0 percent). For each instrument there were no distinct differences in tumor size or stage between groups. Significant differences in age and tumor grade were observed. Tumor grade differences appear attributable to differences in the percentage of patients with an unknown tumor grade (9), and Iowa SEER Registry officials suggest these probably reflect regional differences in reporting practices and not differences in disease severity. Differences in patient age reflect the pockets of rural elderly in Iowa. We control directly for tumor grade, age, and distance to hospitals in our IV analysis, but our results are conditional on the assumption that differences in measured covariates are not symptomatic of differences in unmeasured confounders across instrument groups.
Table 2 contains the estimates of the Chow F-statistics testing the statistical significance of the instruments in equation (1) for 2-, 4-, 8-, and 12-instrument group models using the area BCSI rate. Local-area overall BCSI rates describe a statistically significant portion of the variation in surgery choice for stage II patients across model specifications. Stage II ESBC patients who lived in areas with higher overall BCSI rates were more likely to receive BCSI. Tumor size, tumor location, and lymph node involvement also explained a statistically significant portion of the variation in surgery choice. Stage II ESBC patients with smaller tumors, negative lymph nodes, and with tumors located in the lower-inner quadrant, the upper-outer quadrant, and the axillary tail were more likely to obtain BCSI. Additionally, younger patients and patients living in higher poverty areas were more likely to obtain BCSI.
Table 3 contains IV and OLS estimates of the effect of BCSI relative to MAS on patient survival. The interpretation of these estimates varies with the estimation approach. Row 1 contains unadjusted OLS estimates (estimates of equation (2) without specifying Xi), and row 2 contains adjusted OLS estimates (estimates of equation (2) with Xi as described in the table). Using unadjusted OLS, patients receiving BCSI appeared to have a higher probability of surviving three and four years after diagnosis. Adjusted OLS estimates directly controlled for measured confounders and no survival advantage from BCSI remained. Rows 3 through 6 contain IV estimates using the area BCSI rate instrument at various grouping levels. These estimates are consistently negative and often statistically significant, implying survival disadvantages from BCSI. For example, the 8-group, two-year-survival model suggests that increasing the BCSI rate by 2 percentage points among marginal patients (e.g., from 7.3 to 9.3 using the BCSI rates in Table 1) would have decreased two-year survival for that group by 1 percentage point. The estimates remained fairly consistent across the 2-, 4-, and 8-group specifications, but the 12-group estimates fell in absolute value. Violation of IV assumptions may be the source of inconsistency for the 12-group estimates. The Hausman test statistics (not shown) were all statistically insignificant for the 2-, 4-, and 8-group models. The Hausman test statistics were greatest for the 12-group specifications and the statistic was statistically significant in the four-year survival model, which suggests that our sample size limited the number of groups we can use without introducing spurious relationships that violate IV assumptions.
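The worked example in the text can be read back into an approximate coefficient. Assuming the linear interpretation of equation (2), and using only the rounded figures quoted in the example rather than the tabulated estimate, the implied value of b2 is roughly:

```latex
\hat{b}_2 \;\approx\; \frac{\Delta S}{\Delta R} \;=\; \frac{-0.01}{0.093 - 0.073} \;=\; \frac{-0.01}{0.02} \;=\; -0.5
```

Read this way, the coefficient is the change in the two-year survival probability per unit change in the BCSI probability among marginal patients, so a 2-percentage-point rise in the BCSI rate corresponds to about a 1-percentage-point fall in survival.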
Instrumental variable estimates were consistently negative across instrument specifications. Estimates obtained using the area BCSI rate instrument were generally larger in absolute value and significantly different from zero more often than estimates found using the radiation distance instrument (rows 7–10). The IV estimates found specifying both instruments (rows 11–14) appear as averages between the estimates found using the individual instruments. These results suggest that the sets of marginal patients differ between instruments. To investigate this in a post hoc analysis we divided our sample into patients younger than age 65 and patients older than age 64 and reestimated equation (1) for both subsamples using both instruments at the 4-group level. Differences in marginal patients were confirmed: radiation distance had a relatively greater impact on surgery choice for patients younger than age 65, whereas area BCSI rates had a relatively greater impact on patients older than age 64.
The inability of the 1991 NIH Consensus Statement to substantially increase the rate of BCSI has often been attributed to a lack of provider knowledge. Alternatively, BCSI may not have been more widely and consistently adopted because many providers may not have shared NIH's beliefs that RCT evidence could be extrapolated to ESBC patients with more extensive disease. If it is unclear whether RCT results can be extrapolated to ESBC patients with more extensive disease, justifying a BCSI rate increase needs estimates of the survival impacts for the patients whose surgery choices would be affected by a rate increase. We used IV methods to estimate the survival effects of BCSI relative to MAS for stage II ESBC patients from Iowa whose surgery decisions varied with the practice styles of local providers and access to radiation treatment. We argued that these estimates are naturally representative of patients whose surgery choice would be affected by an increase in BCSI rates.
Our IV estimates are based on the surgery variation revealed by our instruments and their consistency is conditional on the assumption that our instruments are not systematically related to unmeasured confounding variables. We supported this assumption in Table 1 by comparing the distribution of measured confounders across patients grouped by surgery choice and their instrument values. Patients with measured clinical characteristics associated with less-extensive disease were more likely to obtain BCSI. If unmeasured confounders share the same relationship with surgery choice as measured confounders (patients with unmeasured confounder values related to lower survival risk are more likely to receive BCSI), then OLS estimates of BCSI survival adjusted for measured confounders will remain biased in favor of BCSI. When patients were grouped by instruments, the distributions of tumor size and comorbidities appeared similar across groups. Patient age and tumor grade varied with the instruments. We directly controlled for these and other measured confounders in our IV analysis, but our results are conditional on the assumption these differences are not symptomatic of differences in unmeasured confounders across instrument groups. Fuller assumption validation requires additional data. We next estimated a surgery choice model that revealed that stage II patients with more extensive disease were less likely to obtain BCSI, and patients grouped by our instruments had statistically significant different BCSI rates.
Using the BCSI variation revealed by our instruments in IV analysis, we found consistently negative one-, two-, three-, and four-year average survival impacts of BCSI relative to MAS for marginal stage II ESBC patients—patients whose surgery choices were affected by variation in our instruments. If marginal patients are closely aligned to the set of patients whose surgery choices would have been changed with an increase in BCSI rates, these results suggest that an increase would have decreased survival rates. Previous authors (e.g., Harris and Remler 1998) justify this alignment by assuming that the set of patients whose treatment choices are affected by instruments and those patients whose treatment choices change with an increase in treatment rates are similar—those for whom optimal treatment is the least certain. While our IV estimates were consistently negative, their magnitude and significance varied by instrument. This suggests that patients defined as “marginal” vary with the instrument and that the set of patients potentially affected by an increase in BCSI rates is broader than the marginal patients from a single instrument. As a result, our combined instrument estimates may be more closely aligned to the average survival effects that would have resulted from an effort to increase BCSI rates.
Readers should be careful in generalizing our estimates beyond the stage II ESBC patients in Iowa during 1989–1994 whose treatment choices were associated with our instruments. If treatment effects are heterogeneous across ESBC patients, it would be risky to generalize our estimates to stage II patients in other states or other time periods that had markedly different BCSI rates because characteristics of the patients defined as marginal will differ. In addition, our results cannot distinguish whether BCSI was inappropriate treatment for the marginal patients in our sample or whether the BCSI techniques that were used during this period in Iowa were inappropriate for the marginal patients. This is important to distinguish because as BCSI rates increased during the 1990s, it is likely that the set of marginal patients in the late 1990s would have had more extensive disease and, all else being equal, would be more likely to have survival disadvantages from BCSI. However, increased use of BCSI also may have been associated with better skills, which may have reduced the survival advantages of MAS in the later time period for these patients.
This research demonstrates the value of IV methods to help policymakers assess the impacts associated with proposed treatment rate changes. To use IV estimates for this purpose, policymakers must assume that the patients whose treatment choices are affected by instruments are similar to the patients whose treatment choices are affected by rate changes. If this assumption holds here, our results suggest that during the early 1990s in Iowa, BCSI may have been overused, and efforts to increase BCSI rates at this time in Iowa may have been inappropriate. Application of IV methods to more current data from across the country with larger samples that allows for subgroup analysis will provide policymakers with the information required to make more current and specific assessments.
Nancy Keating, seminar participants at the University of Alabama-Birmingham, University of Kentucky, Purdue University, and University of Iowa, and two anonymous referees provided helpful suggestions. Chuck Lynch and Diana Wagner from the Iowa SEER were extremely helpful. Jane Ritho provided research assistance. Any remaining errors are the authors'.
The National Cancer Institute under special studies grant no. NO1-PC-85063-20 provided resources used for data collection. The interpretation and reporting of these data are the sole responsibility of the authors and do not necessarily reflect the position or policy of the government or reviewers and no official endorsement should be inferred. All errors are the responsibility of the authors.