As we have seen, the EB design presents some potential advantages over the conventional design. However, these advantages must be weighed against potential disadvantages and against the feasibility of executing the EB design. One of the major feasibility issues is that patients are randomized to a clinician in the A group or the B group, so two clinicians must be on hand to administer the randomized assignment to a given patient. This is obviously more difficult for acute care interventions such as emergency surgery, and less of an issue for elective surgery. Even for elective procedures, however, patients may be unwilling to travel to different facilities to receive their intervention, so the EB design is perhaps most feasible in group practices or large hospitals where teams of clinicians with a mixture of expertise are available. Even when multiple clinicians are available in the same institution, there remains the practical difficulty that a patient who consulted one clinician at an early stage may later be randomized to an intervention requiring the expertise of a different clinician. Explaining this complication to patients may be difficult and may be a disincentive to participate in the randomized trial, compared with the conventional design in which the patient deals with only one clinician.
Quite apart from the direct impact of expertise on patient outcomes discussed above, the EB design has other potential advantages that may enhance its estimated treatment benefit. First, the EB design recognizes clinician preferences and, by definition, involves clinicians in treating patients exclusively with a technique with which they are familiar. In turn, this is likely to lead to less frequent procedural crossovers than in the conventional design, where for 50 per cent of patients clinicians will be using the technique with which they are less familiar, less expert, and therefore possibly less comfortable. Additionally, differential co-interventions may be reduced when surgeons all use the procedure in which they are invested.
It is well known that crossovers are likely to attenuate the difference between randomized treatment groups, leading to a loss of power in the intention-to-treat analysis, which is the preferred analysis because it avoids the selection biases associated with alternatives such as the per-protocol or as-treated analyses [15]. Even a relatively small proportion of patients crossing over to the non-randomized intervention can lead to a substantial reduction in the expected difference in outcomes between randomized groups.
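The magnitude of this attenuation can be illustrated with a deliberately simple dilution model, in which a crossover patient is assumed to experience the mean outcome of the non-randomized arm. The following sketch uses purely illustrative numbers (not SPRINT estimates), and the function name is our own:

```python
def itt_effect(delta, cross_ab, cross_ba):
    """Expected intention-to-treat difference in means when a fraction
    of each arm crosses over, under the simplifying assumption that a
    crossover patient takes on the mean outcome of the other arm.

    delta    : true difference in mean outcome, A minus B
    cross_ab : proportion randomized to A who actually receive B
    cross_ba : proportion randomized to B who actually receive A
    """
    # Taking B's mean outcome as 0 and A's as delta:
    mean_a = (1 - cross_ab) * delta  # arm A: mixture of A- and B-treated
    mean_b = cross_ba * delta        # arm B: mixture of B- and A-treated
    return mean_a - mean_b           # = delta * (1 - cross_ab - cross_ba)

print(itt_effect(10.0, 0.0, 0.0))  # no crossover: full effect, 10.0
print(itt_effect(10.0, 0.1, 0.1))  # 10% crossover each way: 8.0
```

Under this model the ITT effect shrinks by the sum of the two crossover rates, so even modest crossover (10 per cent in each arm) removes a fifth of the expected difference.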
An additional advantage of the EB design is that it may improve recruitment of both clinicians and patients. Clinicians may be more willing to participate in a study if they know that they will always be using their preferred technique. Patients may be more willing to enrol in a randomized trial if they can be assured that their treating clinician will be an expert in the technique to which they are randomized. Both of these features also arguably strengthen the design from an ethical perspective.
For reasons of generalizability, we would recommend that the EB design be used primarily in situations where moderate or large numbers of surgeons and centres can be involved. In contrast, EB studies with very small numbers of surgeons and centres would yield results of uncertain generalizability. However, the reality is that most surgeons tend to select a specific approach to manage a given clinical problem, and therefore the conventional design is unlikely to overcome the obvious generalizability problems of a small EB trial. For instance, consider the smallest possible case involving only two surgeons. It is probable either that both surgeons have greater expertise in one of the interventions, thus biasing the trial toward that intervention, or that each surgeon has expertise in a different technique. In either situation, it is unlikely that a conventional design would provide a more valid result than the EB design. Similar concerns would affect studies with larger, but still very limited, numbers of surgeons.
In our example of the SPRINT study, the EB design was disadvantaged in two respects compared with the conventional design: a larger standard error and a smaller treatment effect. However, the latter is not always true, and in other situations the EB design might yield a larger treatment effect, in particular when the expertise advantage is associated more strongly with the intervention that has inherently better outcomes. In such a case, there would be a trade-off between the loss of statistical efficiency through a larger standard error and the gain from an enhanced estimated treatment effect, and it is therefore possible that the EB design might on balance be more powerful than the conventional design.
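This trade-off can be made concrete with the standard normal approximation for the power of a two-sided test, which depends on the ratio of the effect to its standard error. The numbers below are hypothetical (not SPRINT estimates) and serve only to show that a larger effect can outweigh a larger standard error:

```python
from statistics import NormalDist

def power_z(delta, se, alpha=0.05):
    """Approximate power of a two-sided z-test for a true effect
    `delta` whose estimate has standard error `se`."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(delta) / se - z_crit)

# Hypothetical scenario: the conventional design yields a smaller
# effect with a smaller SE; the EB design a larger effect with a
# larger SE.  Power depends on the ratio delta/se.
print(power_z(delta=0.30, se=0.10))  # conventional-style: ratio 3.0
print(power_z(delta=0.40, se=0.12))  # EB-style: ratio 3.33, higher power
```

Here the EB-style scenario is more powerful despite its 20 per cent larger standard error, because the effect grew proportionally more than the standard error did.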
In our example, we calculated the expected treatment effect in the EB design by combining the expected advantages associated with surgeon expertise with the inherent effects of the treatments themselves. As such, these estimates might be thought of in the sense of treatment efficacy, because each treatment would be used in relatively ideal circumstances, with an expert surgeon. It is interesting to note that while the relative advantage of the reamed nail compared with the unreamed nail was eroded when expertise was taken into account, all of the expertise effects were nevertheless positive. In other words, the expected outcomes for patients receiving either intervention would be expected to improve when surgery is carried out by an expert in the relevant technique. Thus, while the EB design would show a smaller overall advantage of the reamed nail, it could be argued that the EB design is preferred from the ethical viewpoint, because expected outcomes for patients in both groups would be improved compared with the conventional design.
As noted earlier, the adjustment for expertise effects may produce an increase or decrease in the unadjusted estimated treatment effect. In the scenario where the expertise adjustment results in an increased treatment effect, the efficacy interpretation of study results would lead one to recommend additional training in the superior intervention, so that more clinicians may acquire the expertise necessary to use it.
A further scenario is where the adjustment for expertise causes the advantage of the unadjusted estimate of benefit for treatment A to be completely negated (or even exceeded). This type of result would indicate that additional training to gain expertise in B could potentially improve patient outcomes to the same level as those for patients with treatment A (even when delivered by clinicians expert in A), or even to exceed outcomes for patients on A. This would amount to a qualitative reversal in interpretation between the analyses adjusted and unadjusted for expertise.
A counter-argument to making the adjustment for expertise would be that treatment effects should be thought of in terms of effectiveness, as opposed to efficacy, and should therefore remain unadjusted for expertise effects. This position is reasonable in situations where either treatment A or B could be recommended for particular individual patients, depending on other considerations related to their clinical circumstances, or because of patient values and preferences. Thus, there may be good reasons why clinicians might decide to administer one or the other of two alternative interventions, even though they might not be equally expert in both. Here, the treatment estimate, being unadjusted for expertise effects, would represent the benefit anticipated in routine clinical practice, and as administered by experts or non-experts, as the case may be.
We feel that, on balance, for situations such as that illustrated by our SPRINT example, the acquisition of relevant expertise would be highly desirable. This is because patient outcomes may be expected to improve with either intervention when it is delivered by an expert in the technique in question. If the treatment effect is additionally enhanced through the expertise adjustment, that argues even more strongly for additional training to acquire expertise in the superior technique. If the expertise adjustment leads to similar expected outcomes (or even a reversal of the unadjusted results), one would recommend additional training to acquire expertise in the technique that is inferior in the unadjusted analysis.
An issue for the analysis is how to represent the expertise effects. In our tibial fracture example, we characterized surgeons as expert or not in each of the comparison surgical techniques, and both indicators were taken into account in our model. An alternative would be to associate the expertise effect only with the particular technique in use for a given patient. Thus, while a given surgeon might be an expert in both techniques, one could decide that only their expertise in the randomized intervention at hand is relevant; experts in technique A would then have no advantage over non-experts in A if their patients had been randomized to intervention B.
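The two codings can be expressed as alternative sets of patient-level covariates. The sketch below is a hypothetical helper of our own (the function and column names are not from the paper's model), showing how the second coding discards expertise in the non-randomized technique:

```python
def expertise_covariates(expert_in_a, expert_in_b, randomized_to):
    """Return covariates under two codings of surgeon expertise.

    Coding 1: separate indicators for expertise in A and in B,
              regardless of the randomized intervention.
    Coding 2: a single indicator for expertise in the technique to
              which the patient was actually randomized.
    """
    coding1 = {"expert_A": int(expert_in_a), "expert_B": int(expert_in_b)}
    relevant = expert_in_a if randomized_to == "A" else expert_in_b
    coding2 = {"expert_in_randomized": int(relevant)}
    return coding1, coding2

# Surgeon expert in both techniques, patient randomized to B:
print(expertise_covariates(True, True, "B"))
# Under coding 2, expertise in A confers no advantage for a patient
# randomized to B:
print(expertise_covariates(True, False, "B"))
```

In a regression model, coding 1 contributes two columns to the design matrix while coding 2 contributes one; the choice encodes the substantive assumption about which expertise matters.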
We defined expertise as having performed more than the median number of operations. More generally, we might represent expertise in greater detail, for instance according to the actual number of previous operations carried out: during residency training, since completion of training, or in some period immediately before the start of the randomized trial. More detailed control of the expertise effect would be desirable where it is known that considerable experience is required before no further improvement in patient outcomes can be anticipated, as was mentioned for hernia surgery. This type of example suggests that a run-in period, or a requirement that clinicians have performed some minimal number of cases before the start of the study, may not resolve the problem of differential expertise bias in the conventional design.
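The median-based definition, and one possible finer-grained alternative, can be sketched as follows; the case counts are hypothetical and the log transform is only one of several reasonable choices:

```python
import math
from statistics import median

def expertise_indicators(case_counts):
    """Dichotomize surgeons' prior case counts at the median (the
    definition used in the text), and also return a continuous
    alternative (log case count) allowing finer control."""
    m = median(case_counts)
    binary = [int(n > m) for n in case_counts]           # expert = above median
    continuous = [math.log(n + 1) for n in case_counts]  # log(1 + count)
    return binary, continuous

counts = [2, 5, 8, 20, 60]  # hypothetical numbers of prior operations
binary, continuous = expertise_indicators(counts)
print(binary)  # [0, 0, 0, 1, 1]  (median is 8)
```

A continuous representation would let the model reflect learning curves in which outcomes keep improving over many cases, which the binary cut at the median cannot capture.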
In our description of the EB design, we have focused on the notion that patients are randomized to a surgeon who has expertise in one technique or the other. We also mentioned the possibility of a hybrid design, to include surgeons who are expert in both techniques, and whose patients would therefore be randomized to interventions as usual. It is possible that surgeons may have had experience with both interventions, but they might state that they do not have a preference for either. Here the crucial question would be whether they have greater expertise in one or the other of the interventions. If they do, then the EB design is still relevant. If a surgeon has truly no preference and equal expertise in both techniques, then they could either participate in a conventional design or the hybrid design, depending on the beliefs and expertise of the other participating surgeons.
There is the possibility that some surgeons may be ‘intrinsically innovative’ and relatively willing to acquire new skills with alternative surgical techniques, whereas more traditional surgeons may not be as willing to change. Clearly, we would not wish to force traditional surgeons into attempting a new technique with which they are uncomfortable, but such surgeons can nevertheless participate in an EB trial using the conventional technique. If the trial results suggest better patient outcomes for the innovative technique, one would then need to consider the training required to permit the ‘traditional’ surgeons to incorporate the new technique into their clinical practice.
In summary, we feel that the EB design offers some attractive features, both to participating clinicians and to patients. The EB design offers a number of potential advantages, including the minimization of differential expertise bias, a reduced impact of unblinded interventions, fewer treatment crossovers, and potentially greater ethical strength. We recognize that its disadvantages include the confounding of surgeon with treatment, and potentially limited generalizability. However, the conventional design may also have limited generalizability if clinicians are required to use interventions with which they have less experience. We primarily advocate the EB design for common surgeries (and other clinical interventions) in which it is felt that all competent clinicians could develop expertise, if the study results suggested that this was appropriate. However, more experience is needed to assess its practical feasibility, and to evaluate its statistical properties relative to the conventional randomized trial design. The framework given in this paper provides a first step in this direction.