We conducted three case-control-family studies using identical questionnaires with what would be generally considered acceptable participation from population-based cases, population controls and unaffected sisters of cases (as an additional control group). We found that, despite similar and reasonably high participation at each stage of sampling for each group, the two control groups differed for some key variables including highest educational attainment, age at menarche and, for parous women, number of births, age at first birth and the interval between last birth and reference age.
When we attempted to replicate results for some well-established risk factors for breast cancer, we found that several key associations were not evident when cases were compared with population controls, but were observed when cases were compared with sister controls. This could have been due to differential participation across the groups.
Contrary to established associations, and consistent with results from our previous Australian study of women aged <40 years,1 we found that population controls were more highly educated, less likely to be married and less likely to be foreign born than cases. These differences remained both when women from the previous Australian study were excluded and when all women aged <40 years were excluded. As these observations are for women up to 69 years of age, they suggest that these factors are not differentially associated with early-onset disease, but rather that they are related to study participation. They also did not appear to depend on the proportion of eligible women in the different control groups who participated in each study, as the results were consistent across studies. The present observations of differential participation are consistent with findings from a number of published studies.15–17
It appears that the use of population controls results in a selection bias affecting the estimation of relative risks associated with some key reproductive risk factors, even after adjusting for educational attainment, marital status and country of birth. This bias does not appear to exist when cases are compared with sister controls.
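The mechanism can be illustrated with the standard selection-bias factor for an odds ratio from a 2 x 2 table. The sketch below is illustrative only: the function names and all participation probabilities are hypothetical and are not estimates from our studies. It assumes a single binary exposure and participation that depends only on case status and exposure.

```python
def selection_bias_factor(p_case_exp: float, p_case_unexp: float,
                          p_ctrl_exp: float, p_ctrl_unexp: float) -> float:
    """Multiplicative bias in the odds ratio caused by differential participation.

    Each argument is the (hypothetical) probability that an eligible person in
    that cell of the 2 x 2 table takes part in the study.
    """
    for p in (p_case_exp, p_case_unexp, p_ctrl_exp, p_ctrl_unexp):
        if not 0.0 < p <= 1.0:
            raise ValueError("participation probabilities must be in (0, 1]")
    return (p_case_exp * p_ctrl_unexp) / (p_case_unexp * p_ctrl_exp)


def observed_odds_ratio(true_or: float, p_case_exp: float, p_case_unexp: float,
                        p_ctrl_exp: float, p_ctrl_unexp: float) -> float:
    """Odds ratio expected among participants, given the true odds ratio."""
    return true_or * selection_bias_factor(
        p_case_exp, p_case_unexp, p_ctrl_exp, p_ctrl_unexp
    )


if __name__ == "__main__":
    # Hypothetical scenario: cases participate equally regardless of exposure,
    # but exposed controls are more likely to participate than unexposed ones.
    print(observed_odds_ratio(1.5, 0.7, 0.7, 0.6, 0.4))
```

In this hypothetical scenario a true odds ratio of 1.5 is attenuated to approximately 1.0, because the exposure is over-represented among participating controls. The factor equals 1 whenever participation is unrelated to exposure within cases and within controls.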
We observed a reduced risk associated with increasing age at menarche only when using sister controls, even though age at menarche was correlated between sister pairs (r = 0.25). All three studies found that the sisters had a later age at menarche than did the population controls. Similarly, when cases were compared with sister controls, pregnancy-related variables were strongly associated with breast cancer risk in the expected direction based on the literature, with a decreasing risk associated with increasing number of births, increasing time since last birth and earlier age at first birth, although the latter association was not maintained when all these correlated factors were included in a multivariable analysis. None of these established associations were observed when population controls were used. Population controls were not more likely to be parous than sister controls, but those who were parous had fewer children than parous sister controls, had their first child at a later age and their last child more recently. Again, this was consistent across all studies.
These findings for pregnancy-related variables are consistent with there being greater participation by population controls with a higher socio-economic status. Educational attainment can be considered a surrogate indicator of socio-economic status. The recruited population controls had a socio-economic status and reproductive history more similar to that of the breast cancer cases than did sister controls.
It is perhaps not so clear why this would influence inference in relation to age at menarche. One possible explanation is that the majority of our study participants went through puberty prior to the 1960s, when nutritional status was positively associated with parental income and socio-economic status, which in turn could have influenced age at menarche via the early attainment of body mass.18
One established association we replicated when using population controls, but not sister controls, was an increasing risk with increasing height. It might be that, whereas there is no differential participation with respect to height for population controls, the moderately strong correlation in height (and familial factors associated with height) between cases and sisters (r = 0.47) meant that the latter analysis had inadequate power to detect a real association.
We observed some established associations when using population controls and when using sister controls. Later age at menopause was associated with an increased risk of the same magnitude when using either control group. For parous women, longer lifetime duration of breastfeeding was associated with decreased risk. This association was numerically stronger when population controls were used, possibly because of correlations in breastfeeding behaviour within families.
We have compared analytically the use of sister controls versus population controls for breast cancer case-control association studies with respect to selected demographic, reproductive and other factors known to be associated with risk. Further studies will be required to assess whether our findings are generalizable to other exposures or lifestyle factors, or to men. Within-family correlations in exposures, especially those related to early life, mean that studies using sister controls would have less statistical power to detect associations than those using the same number of population controls. This could be more important for genetic association studies or studies investigating early-life risk factors. This loss of power is referred to as ‘statistical inefficiency’. It does not necessarily argue against using a sibling control design, nor does it mean that the sibling control design is inefficient in terms of the time and money spent on resource collection, a point that is not always well understood.
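As a rough guide to the size of this power loss, consider a continuous exposure with variance σ² and a case-sister correlation r. The within-pair difference then has variance 2σ²(1 − r), compared with 2σ² for an unrelated case-control pair. For small effects in a 1:1 matched analysis, the information about the log odds ratio is approximately proportional to this variance, so each sister pair carries about (1 − r) of the information of an unrelated pair. The sketch below applies this approximation. It is an assumption-laden illustration, not the analysis used in our studies, and the function names are hypothetical.

```python
def relative_efficiency(r: float) -> float:
    """Approximate information per sister pair relative to an unrelated pair.

    Assumes a continuous exposure, 1:1 matching and effects close to the null.
    r is the correlation in the exposure between a case and her sister.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 - r


def pair_inflation(r: float) -> float:
    """Approximate factor by which the number of pairs must grow to keep power."""
    return 1.0 / relative_efficiency(r)


if __name__ == "__main__":
    # Correlations reported in this section: age at menarche and height.
    for label, r in (("age at menarche", 0.25), ("height", 0.47)):
        print(label, round(relative_efficiency(r), 2), round(pair_inflation(r), 2))
```

Under these assumptions the correlation of 0.47 for height corresponds to a relative efficiency of about 0.53, or roughly 1.9 times as many sister pairs for the same power. The correlation of 0.25 for age at menarche corresponds to about 0.75, or roughly 1.3 times as many pairs.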
We found that the cost of recruiting a sister control was substantially less than that of recruiting an independent population control. Population control recruitment requires the organization of a separate and appropriate recruitment strategy and process, which might not even be possible for some studies. On the other hand, sibling control recruitment can be readily merged with case recruitment, as we have demonstrated. Once we were given permission by the case to approach their sibling, the sibling was approached using identical study protocols in terms of informed consent, administration of questionnaires and bio-specimen collection. In this way, recruiting sibling controls could be a much more cost-effective strategy, despite the typically small within-family correlations that might influence the calculation of standard errors for risk estimates derived from sibship comparisons of some covariates. Further, although not all cases have sisters, or give permission to contact their sisters, we have pooled all cases and all sister controls so that no cases were excluded from the analysis. Other sources of controls could be considered, such as other relatives, friends or neighbours, although each has disadvantages.19
Our study suggests that, for association studies, the recruitment of population controls that are representative in terms of all relevant demographic and reproductive variables might no longer be possible, even when participation appears to be acceptable. At least with respect to women, it appears there is differential participation by socio-economic status and hence correlated risk factors. This could be expected to become even more pronounced for future case-control studies, should levels of participation in population-based epidemiologic research continue to decline, particularly for population controls. Given this growing and widespread problem, we suggest that recruitment of siblings as controls might be a valid (i.e. unbiased) and cost-effective alternative.