The main objective of this article was to provide practical guidance to researchers on how best to approach therapist effects in the context of a typical, small RCT for eating disorders. Of the two main approaches for analyzing therapist effects (fixed vs. random), the fixed effects approach may be most feasible in small trials, although it has limitations in terms of generalizability and modeling of certain aspects of the data. Power using the fixed effects approach equates to power to detect differences among the specific therapists in the trial. This is not the same as power to estimate therapist effects in the population of therapists, regarding the therapists in the trial as a sample; this would require a random effects analysis. A properly designed random effects study would be the gold standard to estimate therapist effects, but it would require a large sample of therapists (at least five, preferably more) and therapists would need to be randomly sampled to enable inference to some specified population.
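The distinction between the two approaches can be written out for a single treatment arm. The notation below is a generic sketch, not the model fitted in this article: the symbols $y_{ij}$ (outcome of patient $i$ treated by therapist $j$), $\mu$, $\alpha_j$, $a_j$, $\sigma^2$, and $\tau^2$ are introduced here for illustration, and normally distributed errors are an assumption.

```latex
% Fixed effects: the J therapists in the trial are the only therapists of interest.
\begin{align}
  y_{ij} &= \mu + \alpha_j + \varepsilon_{ij},
  & \sum_{j=1}^{J} \alpha_j &= 0,
  & \varepsilon_{ij} &\sim N(0, \sigma^2)
\end{align}
% Hypothesis tested: H_0 : \alpha_1 = \dots = \alpha_J = 0

% Random effects: the J therapists are treated as a sample from a population of therapists.
\begin{align}
  y_{ij} &= \mu + a_j + \varepsilon_{ij},
  & a_j &\sim N(0, \tau^2),
  & \varepsilon_{ij} &\sim N(0, \sigma^2)
\end{align}
% Quantity estimated: the between-therapist variance \tau^2
```

Under the first formulation, power refers to rejecting $H_0$ for these $J$ therapists only. Under the second, the target is $\tau^2$, and estimating a variance from a handful of therapists is what makes a large therapist sample necessary.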

In the simulations, power to detect therapist effects increased as the size of the therapist effect increased and as the number of therapists decreased. The former finding is to be expected, although this study makes a contribution by quantifying the power in specific scenarios. The latter finding also makes sense—the smaller the number of therapists, the greater the number of patients within each (because the total patient sample size was held constant); thus, the more information available to make conclusions about the effectiveness of the therapist.
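The direction of both findings can be reproduced with a small Monte Carlo sketch. This is not the simulation reported in this article: it assumes normally distributed outcomes with unit variance, therapist means spaced evenly over a fixed range (`max_diff`, in SD units), a one-way ANOVA F test among therapists, and a critical value estimated by simulating under the null. All function names and the values 48 and 0.8 are hypothetical.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F statistic; each group holds one therapist's patient outcomes."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate_f(rng, therapist_means, n_per_therapist):
    """Simulate one treated arm (assumed unit-variance normal outcomes) and return F."""
    groups = [[rng.gauss(m, 1.0) for _ in range(n_per_therapist)] for m in therapist_means]
    return f_statistic(groups)


def estimated_power(n_therapists, total_patients, max_diff, n_sims=2000, alpha=0.05, seed=1):
    """Estimate power to detect a fixed therapist effect, holding total patients constant."""
    rng = random.Random(seed)
    n_per = total_patients // n_therapists
    # Assumption: therapist means are evenly spaced from -max_diff/2 to +max_diff/2.
    step = max_diff / (n_therapists - 1)
    means = [-max_diff / 2 + step * j for j in range(n_therapists)]
    # Empirical critical value: the (1 - alpha) quantile of F when all therapists are equal.
    null_fs = sorted(simulate_f(rng, [0.0] * n_therapists, n_per) for _ in range(n_sims))
    critical = null_fs[int((1 - alpha) * n_sims)]
    hits = sum(simulate_f(rng, means, n_per) > critical for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    for k in (2, 3, 4, 6):
        print(k, "therapists:", estimated_power(k, total_patients=48, max_diff=0.8))
```

With the total number of treated patients held fixed, adding therapists both shrinks each therapist's caseload and spends more degrees of freedom on the comparison, so estimated power falls as `n_therapists` rises. The exact values depend on the assumed outcome distribution and effect definition.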

The simulation results imply that if therapist effects are analyzed as fixed, it is preferable to minimize the number of therapists, to maximize the information about the effectiveness of each therapist. Does this mean that the optimal number of therapists is two? We believe that the answer is no, based on three considerations. First, there are practical reasons to involve more than two therapists. Therapists may drop out during the course of a trial. It is useful to have a pool of trained backup therapists available if needed. If there are only two therapists involved in a trial, and one therapist stops participating, the trial timeframe is likely to be extended (because all treatment falls to a single therapist) and the trial itself might be jeopardized. For this reason alone, it is worthwhile to consider involving at least three or four therapists.

Second, there is little value in detecting small therapist effects. For nearly every treatment, there will probably be at least a very small therapist effect: it is unrealistic to think that therapists will be exactly equal in treatment effectiveness or that patients, even randomly assigned, will have the exact same response to therapy. However, there is no reason to set up a trial and analysis to detect therapist effects that are so small as to be practically unimportant. The important consideration is the ability to detect large, practically important therapist effects. The simulation study presented above provided example definitions of small, medium, and large therapist effects. The results suggest that in a typical small RCT on eating disorders, three or four therapists can be employed with reasonable power (≥70%) to detect large therapist effects.

Third, a small trial with a nonrandom sample of therapists and a fixed effects analysis cannot conclusively generalize to a larger population of therapists. Even so, using a greater number of therapists provides a better sense of intertherapist variation in administration of the treatment, which may be helpful for future research, including the planning of larger trials designed specifically to estimate therapist effects. Variation among three or four therapists may provide a better sense of the therapist effect (at least qualitatively) than does variation among two therapists.

In this article, the therapist effect is defined as the intertherapist difference in outcomes for a single treatment—specifically, the difference between therapists in their patients’ BE reduction between baseline and post-treatment, holding constant the type of treatment administered. In RCTs where each therapist administers multiple types of treatment, the therapist effect may also be defined in terms of the therapist-by-treatment interaction, that is, the difference in outcomes for a therapist administering treatment A versus treatment B. These are different ways of defining therapist effects. Both definitions are informative and intuitive.

How should therapists be selected for the typical, small RCT for eating disorders? The ideal may be to select three or four skilled therapists and provide them with extensive, standardized training. Then the results will indicate whether the treatment can be effectively and consistently administered by a group of skilled, well-trained therapists. In other words, the results will indicate whether the treatment can be effective under relatively favorable conditions. If there is a treatment effect but no therapist effect, this suggests that the treatment can be effectively and consistently administered under relatively favorable conditions. On the other hand, if there is a therapist effect (i.e., the therapists differ in their treatment outcomes), this suggests that the treatment does not consistently succeed under such conditions. If each therapist has a relatively large sample of patients, then having a sample of three or four skilled, well-trained therapists provides more certainty about the nature of therapist differences than does a sample of two therapists. If there is a therapist effect with two therapists, it is uncertain whether the treatment was ineffective or whether one of the therapists was ineffective. However, if there are three or four therapists and the majority show favorable treatment outcomes, this more strongly suggests that the treatment can be effective and the outlying therapist failed to administer the treatment effectively.

One might also view an initial, relatively small trial as a pilot study with respect to therapist effects. Leon et al.^{27} recommend using pilot studies to test the feasibility of various aspects of a trial. Therapist effects in an initial trial might be viewed as a pilot indicating the feasibility of therapist selection and training methods. If a treatment effect is present but there is no therapist effect, this suggests that the therapist selection and training methods were adequate. On the other hand, if therapists differ in treatment outcomes, this points to inadequate therapist selection and/or training methods that need to be improved before conducting a larger trial.

The foregoing discussion assumes that patients are randomly assigned to therapists. Patients vary in the severity of their condition, so it is possible that a spurious therapist effect could result from assignment of sicker patients to one of the therapists, by chance. This is a possibility in any trial, but especially in small trials.
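The dependence of chance imbalance on trial size can be illustrated with a short sketch. It is not drawn from the simulations in this article: baseline severity is assumed to be a standardized score (mean 0, SD 1), patients are assigned to therapists in equal numbers by shuffling, and the function names and sample sizes are hypothetical.

```python
import random
import statistics


def max_severity_gap(rng, n_patients, n_therapists):
    """Randomly assign patients to therapists and return the largest gap
    between therapists in mean baseline severity."""
    # Assumption: severity is a standardized score drawn from N(0, 1).
    severity = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
    rng.shuffle(severity)  # random assignment order
    caseloads = [severity[j::n_therapists] for j in range(n_therapists)]
    means = [statistics.fmean(c) for c in caseloads]
    return max(means) - min(means)


def average_gap(n_patients, n_therapists, n_sims=2000, seed=7):
    """Average the largest between-therapist severity gap over many simulated trials."""
    rng = random.Random(seed)
    return statistics.fmean(
        max_severity_gap(rng, n_patients, n_therapists) for _ in range(n_sims)
    )


if __name__ == "__main__":
    for n in (24, 60, 240):
        print(n, "patients, 3 therapists:", round(average_gap(n, 3), 2))
```

Because the gap is measured in SD units of baseline severity, a larger average gap in the smaller trials indicates more room for a spurious therapist difference to arise from assignment alone.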

Although the objective of this article was to provide practical advice for designing and analyzing RCTs on eating disorders, most of the points discussed are applicable to psychotherapy RCTs in other fields as well. However, the power analysis results, here indicating that three or four therapists will be sufficient to detect relatively large therapist effects, will not necessarily be the same for RCTs for other disorders. Many RCTs for BN/BED have used similar designs (pre/post, treatment vs. control) and measures (e.g., the Eating Disorders Examination), so it is possible to describe a typical RCT in BN/BED and to base power calculations on the results of such trials. In RCTs on other disorders, there may be less consistency in measurement methodologies and the primary outcome measures may have a different distribution; therefore, a different number of therapists might be recommended.

Importantly, the simulation results reported here assumed a particular RCT design, namely a single treatment administered by multiple therapists, compared with no-treatment control. This design has been frequently used in the eating disorders field (e.g., Refs. ^{14}^{–}^{17}), but other designs have also been used. For example, some RCTs have compared multiple treatments, either administered by the same therapists (treatments crossed with therapists, e.g., Refs. ^{28}^{–}^{30}) or with different therapists administering different treatments (therapists nested within treatments, e.g., Refs. ^{31}^{–}^{33}). The recommended number of therapists will likely vary based on the RCT design among other factors. These other designs are beyond the scope of this article. Practical recommendations for approaching therapist effects using these designs in eating disorders research would be a worthwhile topic for future work.

The simulation results reported here are also applicable to RCT designs where a single treatment administered by multiple therapists is compared with another treatment that does not involve therapists, for example, medication-only (e.g., Ref. ^{34}) or pure self-help (e.g., Ref. ^{35}). In such designs, the assumed treatment main effect may be reduced, but the therapist effect (i.e., the difference between therapists in the therapist-administered treatment condition) will be approximately the same as reported here.

In summary, this study points to several practical suggestions for investigators conducting typical, small RCTs (≤200 total patients) for eating disorders (BN/BED): (1) use a fixed effects approach to analyze therapist effects; (2) regard the therapist effects analysis as conclusive with respect to the specific therapists in the trial, but as inconclusive with respect to the population of therapists; and (3) employ about three or four therapists, which is likely to be practical while providing sufficient power to detect large therapist effects and insight into the nature of therapist differences (if there are any).