We examined the relationship of surgery with survival among the 7,086 individuals 30 years or older with metastatic kidney cancer in the SEER dataset diagnosed between 1988 and 2002 [24]. We excluded 2,836 individuals because of missing tumor size, surgery, or mortality information; most of these (2,562) were missing tumor size information. Our final cohort for analysis was 4,250 individuals. displays characteristics of the sample stratified by surgical and missing grade status.
Patients who underwent partial, complete, or radical nephrectomy or nephrectomy NOS were classified as undergoing surgery. Patients who underwent only biopsy, exploratory surgery, palliative bypass, or had unknown status were classified as having not received surgery. There were 1,837 (43%) individuals who did not have surgery and 2,413 (57%) who did have surgery. A total of 2,875 (68%) individuals had missing grade data, with 83% of those who did not have surgery missing grade information compared with 52% of those who did have surgery. Grade was defined as specified in the SEER database except for the collapsing of the two highest grades: well differentiated (grade 1), moderately differentiated (grade 2), and poorly differentiated, undifferentiated, or anaplastic (which we label as grade 3).
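The classification rules above can be sketched in code. This is a minimal illustration only: the category strings are hypothetical labels chosen for readability, not the actual SEER site-specific surgery or grade codes, which are not given in the text.

```python
# Hypothetical text labels standing in for SEER codes (an assumption for illustration).
SURGERY = {
    "partial nephrectomy",
    "complete nephrectomy",
    "radical nephrectomy",
    "nephrectomy NOS",
}
NO_SURGERY = {"biopsy only", "exploratory surgery", "palliative bypass", "unknown"}

# The two highest SEER grades are collapsed into a single category labeled grade 3.
GRADE_MAP = {
    "well differentiated": 1,
    "moderately differentiated": 2,
    "poorly differentiated": 3,
    "undifferentiated": 3,
    "anaplastic": 3,
}


def had_surgery(procedure):
    """Return True for any nephrectomy, False for biopsy, exploratory, bypass, or unknown."""
    if procedure in SURGERY:
        return True
    if procedure in NO_SURGERY:
        return False
    raise ValueError(f"unrecognized procedure label: {procedure!r}")


def collapse_grade(label):
    """Map a grade label to 1, 2, or 3; return None when grade is missing."""
    if label is None:
        return None
    return GRADE_MAP[label]


# Cohort counts reported in the text.
n_eligible, n_excluded = 7086, 2836
n_no_surgery, n_surgery = 1837, 2413
n_cohort = n_eligible - n_excluded
```

Note that unknown surgical status is treated as no surgery, following the rule stated above, whereas missing grade is kept as a separate missing value.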
Demographic information is listed in . After stratifying by missing grade, those who had surgery were younger on average and had larger tumors. In addition, in the stratified data, the surgery group had fewer women, more married individuals, and fewer non-whites than the non-surgery group. For these analyses, we grouped a small number of individuals (fewer than 25) with missing race information in the non-white category, and 93 individuals with missing marital status in the non-married category. Those with missing grade information were more likely to have been diagnosed in earlier time periods (e.g., before 1992) than those with complete grade information. Among those who had grade data available, grade appeared to be worse in those who had surgery than in those who did not. This may be because patients who underwent surgery had more adequate pathologic specimens, allowing for assessment of tumor grade.
In we present Kaplan-Meier curves of mortality outcomes by stage across the 3 grades. We see that those who did not have surgery in the sample had substantially worse outcomes on average than those who did have surgery. This may be due in large part to selection bias, since less healthy patients will likely be considered poor surgical candidates.
Kaplan-Meier Survival Estimates by Grade in the Observed Data
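For readers unfamiliar with the estimator behind these curves, the following is a minimal sketch of the Kaplan-Meier product-limit calculation. The function name and the toy data are illustrative assumptions and are unrelated to the SEER cohort.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each individual
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored individuals both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Toy data (hypothetical): five individuals, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In the stratified curves described above, an estimate of this kind would be computed separately within each surgery and grade group.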
To control for baseline differences between the groups, we examined the effect of surgery in a Weibull proportional hazards regression using the complete data as presented in . We included age, sex, tumor size, race, marital status, and year of diagnosis as covariates in the model in addition to surgery, grade, and their interaction. We used restricted cubic spline basis functions [25] to model age (2 interior knots), tumor size (2 interior knots), and year of diagnosis (1 interior knot); restricted cubic splines allow for nonlinear effects in models. We entered the covariates similarly into the missing data model for the weighted analysis. In , the hazard ratio of the effect of surgery on mortality among those with grade 1 tumors (the main effect term of surgery) is 0.31 (95% CI 0.19-0.52). The hazard ratio increases with increasing grade. In the regression, the interaction terms of surgery with grade are not significant, suggesting that grade does not moderate the protective effect of surgery on outcomes. None of the coefficients of the potentially confounding variables was significantly associated with mortality in the model. In the case of the continuous variables, however, the magnitude and interpretability of the effects depend on the scale of the spline basis functions.
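To make the spline terms concrete, here is a minimal sketch of a restricted cubic spline basis in the commonly used truncated-power form, normalized by the squared knot range. The knot locations and the convention that "2 interior knots" means four knots in total (two boundary plus two interior) are assumptions for illustration; the text does not state where the knots were placed or which software was used.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. Each nonlinear term is
    zero below the first knot and linear beyond the last knot.
    """
    t = sorted(float(v) for v in knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least 3 knots are required")

    def pos3(u):
        # Truncated cubic: (u)_+^3
        return max(u, 0.0) ** 3

    scale = (t[-1] - t[0]) ** 2   # normalization that keeps columns on the scale of x
    gap = t[-1] - t[-2]           # distance between the last two knots
    columns = [float(x)]
    for j in range(k - 2):
        term = (
            pos3(x - t[j])
            - pos3(x - t[-2]) * (t[-1] - t[j]) / gap
            + pos3(x - t[-1]) * (t[-2] - t[j]) / gap
        )
        columns.append(term / scale)
    return columns


# Hypothetical knots for age: boundary knots at 30 and 85, interior knots at 50 and 65.
AGE_KNOTS = [30, 50, 65, 85]
age_terms = rcs_basis(60, AGE_KNOTS)
```

Because the nonlinear columns depend on the knot placement and the normalization, their individual coefficients have no direct interpretation, which is the caveat raised above for the continuous covariates.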
Hazard ratios from Weibull model fit on those with complete data (n=1,375)
Fitting the naive regression is similar to assuming that τ1 and τ2 equal zero. In , we present the hazard ratio under various assumptions about the relationship of surgery and its interaction with grade in the missing data model. For completeness of presentation, we report analyses over a range of sensitivity analysis parameters that includes extreme assumptions (values of τ1 and τ2 that, when exponentiated, correspond to odds ratios of 1/15 to 15). Later in this section, we propose a region in which we believe the true values are likely to lie.
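As a small worked example of this range, odds ratios of 1/15 to 15 correspond to sensitivity parameters between -log(15) and log(15), or roughly -2.71 to 2.71 on the log-odds scale. The sketch below builds an evenly spaced grid over that interval; the number of grid points is an arbitrary assumption, since the resolution actually used is not stated.

```python
import math


def tau_grid(max_odds_ratio=15.0, n_points=31):
    """Evenly spaced values of tau from -log(max_odds_ratio) to +log(max_odds_ratio)."""
    bound = math.log(max_odds_ratio)
    step = 2.0 * bound / (n_points - 1)
    return [-bound + i * step for i in range(n_points)]


grid = tau_grid()
# Every (tau1, tau2) pair on the grid defines one sensitivity scenario.
scenarios = [(t1, t2) for t1 in grid for t2 in grid]
```

The point (0, 0) at the center of the grid corresponds approximately to the naive complete-data regression.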
Hazard ratio (contour lines) of surgical effect in the Weibull proportional hazards regressions for various values of the sensitivity parameters, τ1 (the main effect term) and τ2 (the interaction term).
In the sensitivity analyses, we entered covariates into the missing data model as described in the Weibull model, but also included censoring status, time until death or censoring (using a restricted cubic spline with 1 interior knot), and interactions between censoring status and all covariates and an interaction between surgical status and time until death or censoring. In this way, we sought to create a flexible missing data model.
When τ2 = 0, we have the no-interaction model, in which the relationship of grade with missingness does not vary by surgical status. In the no-interaction case, when τ1 = 0, the hazard ratio is approximately 0.3 to 0.35 regardless of grade, which is consistent with the naive Weibull regression presented in . When τ1 is negative, the effect of surgery becomes more protective for those with grade 2 tumors. In the no-interaction case, a negative value of τ1 means that the log odds of having missing data decrease as grade increases. At extreme negative values of τ1, this would indicate that much of the missing data consists of better differentiated tumors, as discussed in Section 4.1. Intuitively, an incorrectly negative specification of τ1 would result in many grade 3 tumors being incorrectly classified as grade 1 or grade 2 tumors. Since there are fewer grade 1 and grade 2 tumors with observed data, this could explain why varying τ1 has a larger impact on the grade 1 and grade 2 estimates than on the grade 3 estimates. As τ1 increases while τ2 remains equal to zero, more of the missing tumors are assumed to be of higher grade. At extreme positive values of τ1, few of the missing tumors are assumed to be grade 1 tumors, as discussed in Section 4.1, and hence the surgical effect among grade 1 tumors would be less affected by the sensitivity analysis. This indeed seems to be the case in regardless of τ2.
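The role of the two sensitivity parameters can be illustrated with a short sketch. It assumes, as a simplification consistent with the description above but not confirmed against the formal definition in Section 4.1, that the log odds of missing grade equal a covariate-based term h plus τ1 × grade + τ2 × grade × surgery, with grade coded 1 to 3 and surgery coded 0 or 1. All function and argument names are hypothetical.

```python
import math


def missing_prob(h, grade, surgery, tau1, tau2):
    """P(grade missing) under the assumed logistic form.

    h       : linear predictor from the observed covariates (assumed given)
    grade   : 1, 2, or 3
    surgery : 1 if surgery, 0 otherwise
    """
    log_odds = h + tau1 * grade + tau2 * grade * surgery
    return 1.0 / (1.0 + math.exp(-log_odds))


def observed_weight(h, grade, surgery, tau1, tau2):
    """Inverse-probability weight for an individual whose grade was observed."""
    return 1.0 / (1.0 - missing_prob(h, grade, surgery, tau1, tau2))


# With tau1 < 0 and tau2 = 0, missingness becomes less likely as grade increases,
# so observed low-grade tumors receive the larger weights.
w_grade1 = observed_weight(0.0, 1, 0, tau1=-1.0, tau2=0.0)
w_grade3 = observed_weight(0.0, 3, 0, tau1=-1.0, tau2=0.0)
```

Under this assumed form, setting both parameters to zero makes missingness independent of grade given the covariates, which matches the interpretation of the naive analysis.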
As τ2 decreases, missing grades in those with surgery are assumed to be better grades relative to those who did not have surgery. This could affect estimates of the surgical effect, as the surgical benefit would be due in part to confounding by grade rather than a true surgical effect. Such an effect could explain why the benefit of surgery decreases dramatically, to the point of statistical non-significance and a hazard ratio greater than 0.60, in those with grade 1 tumors when τ2 is at extreme negative values. Such confounding could also explain the strengthening of the association when τ2 is positive but τ1 is negative.
Overall, the magnitude of the protective surgical effect is generally consistent across the three tumor grades regardless of the sensitivity parameter. The magnitude of the effect generally ranges between 0.3 and 0.5 except in grade 1 tumors when τ2 is extremely negative and τ1 is positive.
In terms of statistical significance, the majority of the hazard ratio estimates are statistically significant. For those with grade 2 and 3 tumors, the p-value is less than 0.01 in all cases. This is likely due to the larger number of individuals with grade 2 and 3 tumors in the completely observed data, which reduces the standard errors of the estimates. For those with grade 1 tumors, the significance holds in all cases except for extreme negative values of τ2 combined with slightly positive values of τ1, for which the p-value is above 0.10. Again, the lack of statistical significance could be due to confounding from differential classification of missing grade data between surgical groups, as discussed above.
In , we present figures representing the main effect and interaction terms from the Weibull proportional hazards regressions used to generate . In , while there is some evidence of an interaction between surgery and grade in the region in which the surgical effect loses statistical significance in those with grade 1 disease, the p-values for the interaction terms do not fall below 0.05.
Figure 3 Hazard ratios (contour lines) representing the main effect and interaction terms in the Weibull proportional hazards regressions for various values of the sensitivity parameters, τ1 (the main effect term) and τ2 (the interaction term).
Notably, we do not know the true values of τ1 and τ2. However, it is possible to postulate reasonable bounds for them. We believe that in this population of patients with metastatic disease, those with worse grade (grade 3) tumors are more likely to be sicker on presentation. Therefore, they may be less likely to undergo aggressive procedures such as cytoreductive nephrectomy and less likely to have sufficient tissue to properly identify grade. This suggests that τ1 would be positive (τ1 > 0). However, such a trend is likely to be less pronounced among those who have undergone surgery, as an adequate surgical specimen would reduce the association between grade and missing data. This would result in τ2 attenuating the relationship of grade with missing data (−τ1 < τ2 < 0). This would suggest that the true values of τ1 and τ2 reside in a triangle in the lower right quadrants of the graphs in and . The location of the truth in such a region would suggest that surgery is associated with better mortality outcomes (p < 0.05 in all cases). However, the absolute magnitude of the protective effects would not be as great as implied by the naive analysis.
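The proposed bounds can be written as a simple membership check. This is a sketch of the stated inequalities only (τ1 > 0 and −τ1 < τ2 < 0); the function names are hypothetical, and the reading of τ1 + τ2 as the grade-missingness slope among surgical patients assumes the additive log-odds form of the missing data model.

```python
def in_plausible_region(tau1, tau2):
    """True when (tau1, tau2) lies in the triangle tau1 > 0, -tau1 < tau2 < 0."""
    return tau1 > 0 and -tau1 < tau2 < 0


def grade_slopes(tau1, tau2):
    """Change in log odds of missing grade per unit increase in grade.

    Returns (slope without surgery, slope with surgery) under the assumed
    additive log-odds form.
    """
    return tau1, tau1 + tau2


# Inside the triangle, the slope among surgical patients stays positive but is
# smaller than the slope among those without surgery.
no_surgery_slope, surgery_slope = grade_slopes(1.0, -0.4)
```

Points on the boundary (for example, τ2 = 0 or τ2 = −τ1) are excluded, matching the strict inequalities given above.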