Med Decis Making. Author manuscript; available in PMC 2018 January 1.
PMCID: PMC5130606
NIHMSID: NIHMS790058

The Physician Recommendation Coding System (PhyReCS): A Reliable and Valid Method to Quantify the Strength of Physician Recommendations During Clinical Encounters

Abstract

Background

Physicians’ recommendations affect patients’ treatment choices. However, most research relies on physicians’ or patients’ retrospective reports of recommendations, which offer only a limited perspective and are subject to limitations such as recall bias.

Objective

To develop a reliable and valid method to measure the strength of physician recommendations using direct observation of clinical encounters.

Methods

Clinical encounters (n = 257) were recorded as part of a larger study of prostate cancer decision making. We used an iterative process to create the 5-point Physician Recommendation Coding System (PhyReCS). To determine reliability, research assistants double-coded 50 transcripts. To establish construct validity, we used one-way ANOVAs to determine whether relative treatment recommendation scores differed as a function of which treatment patients received. To establish concurrent validity, we examined whether patients’ perceived treatment recommendations matched our coded recommendations.

Results

The PhyReCS was highly reliable (Krippendorff’s alpha = .89, 95% CI [.86, .91]). The average relative treatment recommendation score for each treatment was higher for individuals who received that particular treatment. For example, the average relative surgery recommendation score was higher for individuals who received surgery versus radiation (mean difference = .98, SE = .18, p < .001) or active surveillance (mean difference = 1.10, SE = .14, p < .001). Patients’ perceived recommendations matched coded recommendations 81% of the time.

Conclusion

The PhyReCS is a reliable and valid way to capture the strength of physician recommendations. We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, an important part of patient decision making.

There has been an increasing interest in empowering patients as informed consumers of healthcare goods and services.1 As informed consumers, patients often must choose between multiple treatment options. For example, in early stage prostate cancer, patients must choose whether to receive surgery, radiation, or active surveillance. Each of these treatment options is associated with a unique profile of risks and benefits, and therefore there is not a single right treatment option for all patients.2 Shared decision making is considered by many to be the “pinnacle of patient-centered care,” a process by which patients and physicians work together to choose the best treatment based on both medical factors and patients’ individual preferences.3 As part of this process, physicians may provide patients with recommendations. It is vital to be able to accurately capture these recommendations. Even within the paradigm of shared decision making, physicians’ recommendations strongly impact patients’ treatment choices, potentially even more so than patients’ cancer severity, age, or anxiety.4–6

However, current research on physician recommendations has several limitations. Physician recommendations are frequently treated as binary, in which a physician either does or does not recommend a single treatment option.4 In reality, however, treatment recommendations are often more nuanced, and physicians can provide recommendations of varying strength for multiple treatments. Additionally, most studies have relied on patient reports of physician recommendations, which are subject to recall bias.7,8 Furthermore, motivated cognition may lead patients to misremember physician recommendations, such that their reported recommendation matches their treatment choice rather than accurately reflecting their conversation with the physician.9

In order to address these limitations, we developed the Physician Recommendation Coding System (PhyReCS), which captures the strength of physician recommendations during appointments within the context of early stage prostate cancer. The PhyReCS addresses the aforementioned limitations in the following ways: it is a continuous (rather than binary) measure, has the flexibility to capture multiple nuanced recommendations, and avoids problems associated with relying on patients’ retrospective reports of recommendations. In this article, we provide an in-depth explanation of the PhyReCS, measure its reliability, and assess its validity.

Methods

Setting and Study Population

Appointments (n = 257) were recorded and transcribed as part of a larger trial in which men undergoing prostate biopsies were randomized to receive either a standard or a low-literacy prostate cancer treatment decision aid prior to choosing a treatment for their early stage prostate cancer.10 The type of decision aid did not influence our measures of interest;* therefore, it is not discussed further in this article. Appointments were recorded from 2008–2012 at four geographically dispersed, academically affiliated Veterans Affairs medical centers. During each appointment, the patient and physician discussed treatment options for the patient’s newly diagnosed early stage (low or intermediate risk) prostate cancer. Patient and physician demographics are listed in Table 1. There were 47 unique physicians in our study, most of whom were residents or fellows. On average, each physician was recorded in 5.31 clinical encounters (SD = 3.77).

Table 1
Patient and physician demographics

Scale Development

We identified a subset of transcripts using maximum variation purposeful sampling techniques, in which we selected transcripts that differed on variables that we expected to influence physician recommendations (e.g. age, Gleason Score).11 We then used an iterative process to develop a 5-point Physician Recommendation Scale to capture how physicians portrayed each treatment option during the clinical appointment as a whole. We defined the boundaries of each recommendation score through repeated application and discussion.

For each treatment option (surgery, radiation, and active surveillance), recommendations were coded as follows: +2 (−2) indicated that the physician made a strong recommendation for (against) the treatment, +1 (−1) indicated that the physician made a mild recommendation for (against) the treatment, and 0 indicated that the physician recommended neither for nor against the treatment. If the physician did not mention a treatment, this was coded as “not discussed;” for the purpose of these analyses, this was treated as equivalent to a strong recommendation against the treatment option (−2), because such an omission essentially indicated that the physician did not think it was even worth mentioning the option. Such omissions occurred relatively infrequently (surgery: n = 3; radiation: n = 3; active surveillance: n = 11). Thus, for each appointment, coders assigned a recommendation score for each of the three primary treatment options (surgery, radiation, and active surveillance). Importantly, recommendation scores were independent such that a recommendation against a particular treatment did not automatically translate into a recommendation for another treatment.
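
To make the scoring scheme concrete, here is a minimal Python sketch of how one appointment’s scores might be encoded, with “not discussed” (ND) recoded to −2 for analysis as described above. The treatment names and example values are ours, not taken from the study data.

```python
# Minimal sketch (illustrative values only): encoding one appointment's
# PhyReCS scores and recoding "not discussed" as a strong recommendation
# against (-2), per the analysis rule described above.

ND = "ND"  # sentinel for a treatment the physician never mentioned

appointment_scores = {"surgery": 1, "radiation": 0, "active_surveillance": ND}

def recode(score):
    """Map ND to -2 for analysis; leave numeric scores unchanged."""
    return -2 if score == ND else score

analysis_scores = {tx: recode(s) for tx, s in appointment_scores.items()}
print(analysis_scores)
# {'surgery': 1, 'radiation': 0, 'active_surveillance': -2}
```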

Although recommendation scores were global judgments that considered the appointment in its entirety, there were often key statements that captured the sentiment of physicians’ feelings towards a particular treatment option. Table 2 provides examples of these types of statements. As noted above, however, keep in mind that final recommendation scores were global scores based on the entire appointment rather than any single statement in isolation; therefore, we provide an example of how recommendation scores evolved over the course of an appointment in Table 3.

Table 2
For Each Recommendation Score, Examples of Key Statements That Capture the General Sentiment of Physicians’ Feelings Towards Each Treatment Option
Table 3
Example of how physician recommendation scores evolved over the course of an appointment in response to specific physician comments. The “Evolving Rec Score” column reflects how the coder modified each recommendation score in response ...

Coder Training and Scale Application

The lead researcher trained five research assistants (RAs) using the finalized codebook (available in the online appendix): each RA received approximately 30 hours of guided practice over a period of 2–3 weeks until they demonstrated a thorough understanding of the PhyReCS. RAs then double-coded a random subset of 50 previously unseen transcripts, which we used to calculate reliability. Discrepancies were resolved via team discussion. Given the high reliability (see below), it was appropriate for RAs to single-code the remainder of the encounters (n = 207). All coding was finished within 3 weeks, minimizing the possibility of coder drift. RAs coded the transcripts in the development set at the end of the coding period to minimize the chance of carryover from the training period. Example transcripts are available upon request.

To determine reliability, we calculated Krippendorff’s alpha for the recommendation scores for the subset of transcripts that were double-coded (n = 50). Krippendorff’s alpha offers advantages over other measures of inter-rater reliability, such as the intraclass correlation coefficient or weighted kappa, and it can be used “regardless of the number of observers, levels of measurement, sample sizes, and presence or absence of missing data.”12,13 We treated the scale as an interval variable and used Hayes’ 2013 SPSS macro to calculate Krippendorff’s alpha, using 5000 bias-corrected bootstrap samples to determine 95% confidence intervals.12 A Krippendorff’s alpha value of 0 indicates that reliability is no better than would be expected by chance, whereas a value of 1 indicates perfect reliability; values of 0.67–0.8 are considered indicative of acceptable reliability, and values above 0.8 indicate excellent reliability.13
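
For readers who want to reproduce this kind of reliability analysis outside SPSS, the sketch below shows one way to do it in Python, assuming the third-party krippendorff package; the authors used Hayes’ bias-corrected bootstrap macro, whereas this sketch uses a simpler percentile bootstrap, and the coder scores are invented for illustration.

```python
# Minimal sketch: interval-level Krippendorff's alpha with a percentile
# bootstrap CI. Assumes the third-party `krippendorff` package
# (pip install krippendorff); toy scores below, not the study data.
import numpy as np
import krippendorff

rng = np.random.default_rng(0)

# Rows = the two coders; columns = the double-coded recommendation scores
# (the study had 150 such scores from 50 transcripts).
scores = np.array([
    [2, 1, 0, -1, -2, 2, 1, 1, 0, -2],
    [2, 1, 0, -1, -2, 1, 1, 1, 0, -2],
])

# Point estimate, treating the scale as interval data as the authors did.
alpha = krippendorff.alpha(reliability_data=scores,
                           level_of_measurement="interval")

# Percentile bootstrap over units (Hayes' macro uses a bias-corrected
# variant; this is a simplification).
boot = []
n_units = scores.shape[1]
for _ in range(5000):
    cols = rng.integers(0, n_units, size=n_units)  # resample with replacement
    sample = scores[:, cols]
    if np.unique(sample).size < 2:
        continue  # alpha is undefined when a resample has no variation
    boot.append(krippendorff.alpha(reliability_data=sample,
                                   level_of_measurement="interval"))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"alpha = {alpha:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```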

Treatment Received

Patient treatment choice was determined via chart review six months after the recorded appointment (data available for 216 individuals).§ Five individuals received a treatment other than active surveillance, radiation, or surgery (e.g. hormone therapy); these individuals were not included in analyses that examined which treatment patients received. Therefore, for analyses involving patient treatment choice, n = 211.

To examine the construct validity of the PhyReCS, we tested whether recommendation scores differed as a function of treatment received. Given that physicians’ recommendations are a strong determinant of patients’ treatment choices,4 coded recommendation scores should, on average, be higher for the treatment the patient received than for the non-chosen treatment options. We calculated a relative recommendation score for each treatment option, equal to that treatment’s recommendation score minus the average of the recommendation scores for the other two treatments. For example, Active Surveillance (relative) = Active Surveillance (raw) − [Surgery (raw) + Radiation (raw)] / 2. We then used a series of three one-way ANOVAs to examine whether the average relative recommendation score for each treatment differed as a function of treatment received.** For example, we tested whether the average relative surgery recommendation score differed for individuals who received surgery versus radiation versus active surveillance. If Levene’s test indicated that the assumption of homogeneity of variance was violated, we used a Brown-Forsythe correction for the omnibus F-test and a Tamhane test for pairwise comparisons. Otherwise, we used a Bonferroni correction for pairwise comparisons.
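
The sketch below illustrates the relative-score computation and omnibus test in Python with invented data; scipy’s plain one-way ANOVA and Levene’s test stand in for the full procedure, since SciPy does not directly provide the Brown-Forsythe-corrected F-test or Tamhane pairwise comparisons the authors used.

```python
# Minimal sketch (invented data): relative recommendation scores and a
# one-way ANOVA across treatment-received groups.
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "surgery":             [2, 1, 2, 0, -1, 1, -1, 0, 1],
    "radiation":           [0, 1, -1, 2, 1, 2, -1, -2, 0],
    "active_surveillance": [-2, -1, 0, -1, 0, -2, 2, 1, 1],
    "treatment_received":  ["surgery", "surgery", "surgery",
                            "radiation", "radiation", "radiation",
                            "as", "as", "as"],
})

# Relative score = raw score for a treatment minus the mean of the other two.
df["surgery_rel"] = df["surgery"] - (df["radiation"]
                                     + df["active_surveillance"]) / 2

groups = [g["surgery_rel"].values
          for _, g in df.groupby("treatment_received")]

# Check homogeneity of variance first, as the authors did.
levene_p = stats.levene(*groups).pvalue
f, p = stats.f_oneway(*groups)  # plain omnibus one-way ANOVA
print(f"Levene p = {levene_p:.3f}; F = {f:.2f}, p = {p:.3f}")
```

When Levene’s test is significant, a variance-robust omnibus test such as Welch’s ANOVA would stand in for the Brown-Forsythe correction used in the paper.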

Perceived versus coded recommendations

Patients’ perceptions of their physicians’ recommendations were determined via phone interview conducted by a professional survey company approximately 7–10 days after the recorded appointment (data available for 205 patients). Patients were asked, “Did your physician provide a recommendation?” If they indicated yes, they were then asked, “What was the recommendation?” with the answer choices of surgery, external beam radiation, brachytherapy,†† watchful waiting/active surveillance, and other. Patients who answered “other” (n = 2) were excluded from analysis; thus, for analyses involving patients’ perceived recommendations, n = 203.

To examine the concurrent validity of the PhyReCS, we examined the concordance between physicians’ recommendations as perceived by patients (“perceived recommendations”) and physicians’ recommendations as determined by coders using the PhyReCS (“coded recommendations”). Given that patients and coders experienced the same conversation, if the PhyReCS is valid, there should be relatively high concordance between patients’ perceived recommendations and our coded recommendations. However, because patients’ perceptions may be influenced by factors other than what objectively occurred during the appointment, we would not be surprised to see some differences between patients’ perceptions and our coded recommendations.

We classified the perceived versus coded recommendation as a “match” if the treatment that the patient perceived as recommended also received the highest PhyReCS recommendation score. On the other hand, we classified the perceived versus coded recommendation as a “mismatch” if the treatment that the patient perceived as recommended did not receive the highest PhyReCS recommendation score. For patients who perceived that the physician did not provide a recommendation, we classified the perceived versus coded recommendation as a “match” if more than one treatment received the highest recommendation score and as a “mismatch” in all other cases.
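
Because the match/mismatch rule is purely mechanical, it can be stated as a short function. The Python sketch below is our paraphrase of the rule; the function and variable names are invented.

```python
# Minimal sketch of the match/mismatch rule; names and scores are invented.

def classify(perceived, coded_scores):
    """perceived: treatment name, or None if the patient perceived no
    recommendation. coded_scores: dict of treatment -> PhyReCS score."""
    top = max(coded_scores.values())
    top_treatments = [t for t, s in coded_scores.items() if s == top]
    if perceived is None:
        # "No recommendation" matches only if the top score is tied.
        return "match" if len(top_treatments) > 1 else "mismatch"
    return "match" if perceived in top_treatments else "mismatch"

scores = {"surgery": 2, "radiation": 0, "active_surveillance": -1}
print(classify("surgery", scores))  # match
print(classify(None, scores))       # mismatch (surgery alone scores highest)
```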

Funding

All funding agreements ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

Human subjects approval

This study was approved by the Institutional Review Boards at each of the participating sites; written informed consent was obtained from all patients and physicians.

Results

Reliability

Each of the 50 double-coded transcripts included three recommendation scores (one each for surgery, radiation, and active surveillance); thus, we had 150 recommendation scores with which to assess the reliability of our scoring system. The Krippendorff’s alpha for all treatments was .89 (95% CI [.86, .91]), indicating excellent reliability. The Krippendorff’s alpha for the individual treatments was as follows: active surveillance = .94 (95% CI [.92, .95]); surgery = .87 (95% CI [.82, .91]); and radiation = .64 (95% CI [.49, .80]), all of which indicate acceptable to excellent reliability. Of note, the lower reliability for the radiation scores was due to less variation in those scores; the Krippendorff’s alpha formula accounts for the higher likelihood of chance agreement when variation is low, so each individual discrepancy reduces the reliability estimate more. Table 4 displays the observed versus expected coincidence matrices for the scores given by the two RAs coding each encounter. There were 29 total discrepancies out of 150 scores assigned. One coder was responsible for 15 of these discrepancies; she received remedial training before being allowed to continue with coding. Four of the discrepancies were significant, in which coders assigned scores that were 2 points apart on the scale. Three of these discrepancies were due to a misunderstanding of the coding rules by the coder who received remedial training. The remaining 25 discrepancies involved minor disagreements, in which coders assigned scores that differed by only one point on the scale. Importantly, coders never disagreed about whether a physician recommended for versus against a treatment option; in other words, there were no discrepancies that involved a negative versus positive score.

Table 4
For transcripts in the test set (n = 50), observed versus expected coincidence matrices for recommendation scores (n = 150) assigned by Coder 1 versus Coder 2a on the same transcript. Frequencies on the diagonal represent perfect matches.

Relative recommendation scores differ as a function of treatment received

A series of three separate ANOVAs revealed that, for all treatments, the relative treatment recommendation score was higher for individuals who received that treatment versus the other two treatments (Fig. 1). The relative surgery recommendation score differed as a function of treatment received [F(2,208) = 53.01, p < .001]. Specifically, the relative surgery recommendation score was higher for individuals who received surgery versus radiation [M surgery = 1.69, SE = .14 vs. M radiation = .52, SE = .20; mean difference = 1.17, SE = .24, p < .001] and surgery versus active surveillance [M surgery = 1.69, SE = .14 vs. M active surveillance = −0.27, SE = .13; mean difference = 1.96, SE = .19, p < .001]. The relative radiation recommendation score also differed as a function of treatment received [Brown-Forsythe F(2,85.78) = 23.22, p < .001]. Specifically, the relative radiation recommendation score was higher for individuals who received radiation versus surgery [M radiation = 1.08, SE = .22 vs. M surgery = .30, SE = .08; mean difference = .78, SE = .23, p = .004] and radiation versus active surveillance [M radiation = 1.08, SE = .22 vs. M active surveillance = −.25, SE = .09; mean difference = 1.33, SE = .23, p < .001]. Finally, the relative active surveillance recommendation score differed as a function of treatment received [Brown-Forsythe F(2,187.83) = 105.81, p < .001]. Specifically, the relative active surveillance recommendation score was higher for individuals who received active surveillance versus surgery [M active surveillance = .52, SE = .14 vs. M surgery = −1.99, SE = .15; mean difference = 2.51, SE = .19, p < .001] and active surveillance versus radiation [M active surveillance = .52, SE = .14 vs. M radiation = −1.61, SE = .19; mean difference = 2.12, SE = .22, p < .001].

Fig. 1
Average relative recommendation score for each treatment as a function of treatment received. For each treatment, the average recommendation score for the treatment received was higher than the average recommendation score for the other two treatments. ...

Comparison of perceived versus coded recommendations

There was a high level of concordance between patients’ perceived recommendations and the coded recommendations determined using the PhyReCS. Overall, the perceived and coded recommendation matched in 81% (164/203) of cases. Patients perceived that their physicians recommended active surveillance in 52 appointments; the perceived and coded recommendation matched in 80% (42/52) of these appointments. Patients perceived that their physicians recommended surgery in 70 appointments; the perceived and coded recommendation matched in 91% (64/70) of these appointments. Patients perceived that their physicians recommended radiation in 26 appointments; the perceived and coded recommendation matched in 85% (22/26) of these appointments. Patients perceived that their physicians provided no recommendation in 55 appointments; the perceived and coded recommendation matched in 65% (36/55) of these appointments. The perceived and coded recommendation mismatched in 19% of cases (39/203). In 44% of these cases (17/39), patients perceived no recommendation but the PhyReCS determined that the physician did recommend a particular treatment. Notably, the patient received the PhyReCS-recommended treatment in 65% of these cases (11/17).

Discussion

In this paper, we demonstrated that the Physician Recommendation Coding System (PhyReCS) is a reliable and valid way to quantify the strength of physician recommendations during clinical appointments in the context of early stage prostate cancer. We showed that the scale could be applied with high reliability. We established construct validity by showing that the average relative recommendation score for each treatment was higher for individuals who received that treatment versus the other two treatment options. In addition, we demonstrated concurrent validity by showing that there was a high level of concordance between patients’ perceived recommendations and coded recommendations as determined by the PhyReCS.

We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, and it is flexible enough to be adapted to many clinical settings. For example, patients with early stage breast cancer must choose whether to receive breast-conserving therapy (lumpectomy plus radiation) or mastectomy. As in early stage prostate cancer, the “right” treatment choice depends upon patient preference in addition to medical factors.14 Patient-physician conversations about these treatment options are complex, and the physician may recommend multiple treatment options with varying strength. There are, of course, important differences between these clinical situations, including the gender of patients; however, we believe that, with proper validation, the PhyReCS could help researchers better understand the connection between physician recommendations, patient preferences, and treatment choices in clinical settings beyond prostate cancer.

Given the centrality of physicians’ recommendations in the medical decision-making process, the PhyReCS will also allow researchers to answer other interesting and important questions. For example, future research could examine when and why there is discordance between coded treatment recommendations and patients’ perceived recommendations, potentially providing insights into cognitive processes such as motivated cognition and recall bias. Are there circumstances in which patients are more (vs. less) motivated to perceive treatment recommendations as consistent with their chosen treatment option? In addition, the fact that patients often received the PhyReCS-recommended treatment when they perceived no recommendation suggests that the PhyReCS may be able to capture subtle recommendations that patients do not perceive, although future research is clearly needed to more fully examine this possibility.

The PhyReCS could also help researchers examine whether patients are more versus less satisfied with their decisions and/or clinical appointments as a function of the strength of physicians’ recommendations. Patient satisfaction evaluations primarily reflect patients’ perceptions of communication with their healthcare providers;15–17 therefore, it is reasonable to assume that patient satisfaction measures will be affected by differences in physician recommendations, which could be captured with the PhyReCS. It is possible that advice may decrease decision satisfaction, as people like to feel that they are experiencing free choice,18 and receiving advice can feel like an infringement on this sense of free choice.19 Alternatively, advice may increase decision satisfaction, as advice can be an important aspect of coping, and patients may feel that their physicians are emotionally supportive when they give recommendations.20 Given that recommendations change patients’ sense of responsibility for a decision and responsibility is known to impact decision satisfaction,21 the PhyReCS could also be used to examine the connection between responsibility and patient decision satisfaction.

There are limitations to the PhyReCS and our study in general. First, although the PhyReCS captures the strength of physician recommendations, it does not capture other aspects of the recommendation, such as the motivation for physicians’ recommendations, which is an important factor when trying to fully understand physician recommendations. It also does not capture whether recommendations were solicited, which is known to influence how people perceive advice.22 Second, we did not collect other measures that would help to establish concurrent validity, such as which treatment(s) physicians believed that they recommended during the appointment. Third, our study has limitations in terms of the generalizability of our results. For example, the study was conducted in the Veterans Affairs system, where patients are older, sicker, and poorer on average,23 which may impact how physicians give recommendations. The PhyReCS may need to be adjusted with other patient populations; for example, the boundaries between recommendation scores may need to be adjusted when physicians are interacting with patients of a higher socioeconomic status. In addition, all patients (and most physicians) were male; given differences in communication styles between male and female physicians,24 future research is needed to examine the reliability and validity of the PhyReCS in clinical settings with female patients and/or physicians. Finally, although we have evidence that our scale is valid, it is possible that another scale would have done an even better job of capturing physician recommendations. Future research is needed to optimize the scale.

In conclusion, although clinical interactions within early stage prostate cancer are nuanced and complex, the PhyReCS makes it possible to capture how physicians recommend multiple treatment options with high reliability and validity. We feel the PhyReCS could allow researchers to more fully examine physician recommendations, an area with significant substantive and theoretical importance.

Acknowledgments

We would like to thank Haley Miller, Margaret Oliver, Elizabeth Reiser, and Biqi Zhang for their help coding. We would also like to thank Valerie Kahn and Daniel Cannochie for their help in data collection and project management.

Financial support for this study was provided in part by the following institutions: an IIR Merit Award from the U.S. Department of Veterans Affairs (IIR 05-283) to Angela Fagerlin, a Health Policy Investigator Award from the Robert Wood Johnson Foundation to Peter A. Ubel, and Federal Grant T32 GM007171 to Karen Scherr as part of the Medical Scientist Training Program. All funding agreements ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

Appendix A. Physician Recommendation Coding System (PhyReCS) Codebook

Purpose

The Physician Recommendation Coding System (PhyReCS) provides researchers with a reliable and valid way to capture the strength with which physicians recommend for or against treatment options during clinical appointments; it was developed in the context of early stage prostate cancer. By quantifying physicians’ recommendations, the PhyReCS will enable researchers to answer important questions involving physicians’ recommendations, which are a key component of many clinical encounters and medical decision-making processes.

Coding overview

For men with early stage prostate cancer, there are three main treatment options: active surveillance, surgery, and radiation (external beam and brachytherapy). Each treatment option receives an independent global treatment recommendation score, which ranges from −2 (strong recommendation against the treatment) to +2 (strong recommendation for the treatment). Recommendation scores are global scores in that coders consider the entirety of the conversation, including the order and context of all statements, when determining recommendation scores.

Descriptions of treatments

  1. Active Surveillance (AS): involves forgoing immediate active treatment and getting regular follow-up tests (prostate-specific antigen blood tests, digital rectal exams, and biopsies). Patients can choose to receive active treatment at any time if there are clinical signs that the cancer is progressing or if they simply decide they want active treatment.
    1. Note: There is also a more passive form of treatment called “watchful waiting (WW).” Although WW is technically different from AS (WW typically does not include regular biopsies), patients and physicians often use these terms interchangeably. In the (rare) case that the physician differentiates between AS and WW, assign a recommendation score to both treatments separately and note this fact on the coding worksheet. The final “active surveillance recommendation score” will be equivalent to whichever score is higher (AS or WW).
  2. Surgery (Sx): involves removing the prostate (“prostatectomy”). Can be done using two different approaches: open or robotic (sometimes referred to as “laparoscopic”).
    1. Note: If the physician differentially recommends open versus robotic surgery (e.g., “you should not get robotic surgery because your weight is too high but open surgery is a good option for you”), assign a recommendation score to both approaches separately and note this fact on the coding worksheet. The final “surgery recommendation score” will be equivalent to whichever score is higher (open or robotic).
  3. Radiation (Xrt): involves radiating the prostate. Can be done using two different approaches: external beam or brachytherapy (sometimes referred to as “seeds”). Each approach will receive its own score since physicians often differentiate between the appropriateness of these treatments.
    1. External beam radiation: radiation is directed at the prostate from outside the patient’s body. The physician may describe initially placing “markers” in the prostate so that they can accurately direct the external beam radiation; this is still considered external beam radiation.
    2. Brachytherapy: radiation comes from “seeds” that are implanted into patients’ prostate.
    3. Note: If physicians are discussing “radiation” in general, this is typically in reference to external beam radiation therapy (although use context clues to interpret). Brachytherapy recommendation scores reflect explicit comments about brachytherapy.
    4. Note: Brachytherapy may be mentioned in passing, particularly if the option is unavailable at that particular treatment site. In these cases, there may not be enough information to assign a recommendation score and the code “not discussed” may be the most appropriate code in terms of whether the physician recommended the treatment option.

Coding instructions

The coder will assign a global recommendation score for each treatment between −2 and +2, with the coder assuming a beginning score of 0 (neutral). Global scores are intended to capture coders’ overall impression of physicians’ recommendations for the main treatment options: active surveillance, surgery, and radiation (both external beam and brachytherapy). The recommendation scores are “at the end of the day” reflections.

Notes:

  • The function of a statement can be more important than the specific words the physician uses. For example, a physician does not have to use the word “recommend” for the statement to function as a recommendation.
  • Each treatment recommendation code is independent of the others. For example, giving an explicit recommendation for surgery does not automatically correspond to a recommendation against active surveillance and/or radiation.
  • Recommendations for/against treatments are inherently connected to the actual treatment choice for a specific patient, not simply a description of the treatment. Physicians can make statements about treatment side effects being undesirable (e.g., there are risks of surgery such as infection), but these statements must be linked to the choice of that treatment option for the patient in order for these statements to affect the recommendation score (e.g., there are risks of surgery, such as infection, and because of your health condition I don’t think it would be a good choice for you.).
  • If the physician does not mention a particular treatment, code it as “not discussed (ND).”
Global Treatment Recommendation (Rec) Score

  ND (Not discussed): Physician does not mention the treatment option or provides so little information that a recommendation score cannot be assigned.
  −2 (Strong rec against): Physician gives a clear, strong rec against the treatment option.
  −1 (Mild rec against): Physician gives a mild, relatively subtle rec against the treatment option.
  0 (Neutral): Physician portrays the treatment as an option; gives no indication of whether this treatment is particularly good or bad for the patient.
  +1 (Mild rec for): Physician gives a mild, relatively subtle rec for the treatment option.
  +2 (Strong rec for): Physician gives a clear, strong rec for the treatment option.

Detailed description of each code

  1. Not discussed
    Physician does not mention treatment option or mentions it so briefly that no recommendation score can be assigned.
    • Often occurs when brachytherapy is unavailable. For example, “What the VA does is they do external beam radiation, meaning instead of putting like little radioactive seeds in your prostate, they shoot radiation at it.” In this excerpt, there is not enough information to code this as a recommendation for or against brachytherapy. If this was the only type of statement about brachytherapy in the appointment, brachytherapy would be coded as “not discussed.”
    • May also occur if physician decides a treatment option (e.g., AS) is not appropriate for patient but does not vocalize that decision. Instead, s/he only mentions the other treatment options (e.g., surgery and radiation) during the appointment.
  2. Strong recommendation against (“This is not a good option.”)
    Physician gives a clear recommendation against treatment option.
    • Physician explicitly states that a patient should not choose treatment option (or options).
    • Physician makes clear value statements about a particular treatment option indicating that it would be bad for the patient.
    • May occur because medical factors make treatment contraindicated (e.g., surgery for obese patient with previous bowel surgery).
    • “I do not recommend” phrasing not required; can be some other clear statement negating this option
  3. Mild recommendation against (“This is probably not a good option.”)
    Physician identifies treatment as a valid option, but is leaning away from this option. Can suggest negatives about this option, but does not go as far as giving an explicit recommendation against option.
    • Physician makes some negative value statements about the treatment option but does not go as far as giving an explicit recommendation against option.
    • May include physician “softening” a recommendation against the treatment with phrases such as “but it’s your choice” or “but there are other options that you could choose and you have to make the final decision” or “but I’m biased”
  4. Neutral – (“This is a fine option.”)
    Physician identifies treatment as a valid option. The physician may refer to it as “fine” or make no value statements about the option beyond it being one the patient could choose.
    • Physician is neutral about treatments’ appropriateness for the patient.
    • Physician says treatment option is reasonable, but never qualifies it as a good or bad option.
    • Physician makes no value statements about treatment option beyond identifying it as an option.
  5. Mild recommendation for (“This is a good option.”)
    Physician identifies treatment as a valid option, and is leaning towards this option. Can suggest positives about this option, but does not go as far as giving an explicit recommendation.
    • Physician makes some positive value statements about the treatment option but does not go as far as giving an explicit recommendation.
    • Physician makes some positive value statements about a treatment option, but never rises to the level of a recommendation.
  6. Strong recommendation for (“This is a very good option.”)
    Physician gives a clear recommendation for treatment option.
    • Physician explicitly states that a patient should choose treatment option
    • Physician makes clear value statements about treatment option indicating that it would be good or “the best” for the patient
    • A strong recommendation CAN include statements such as “but it’s your choice” or “but you’re the boss.” A strong recommendation does NOT mean that the physician tells the patient that treatment is the only option.
    • “I recommend” phrasing not required; can be some other clear statement promoting this option

Coding process

  1. For each clinical appointment, you will fill out a “coding worksheet.”
  2. Complete the entire coding process in one sitting – do not take a break and then return to the same appointment later.
  3. All treatments begin with a score of 0. As you read the transcript, the physician may make recommendations for/against particular treatments. As you read these types of statements, adjust the recommendation score up/down as appropriate (a schematic sketch of this bookkeeping follows this list).
  4. If, by the end of the transcript, there has been no mention of a particular treatment, change the score to “not discussed (ND).”
  5. At the end of the transcript, circle the final recommendation scores (one for each treatment option) that have evolved over the course of the appointment.
  6. “Step back” and make sure that your final recommendation scores reflect the physician’s overall portrayal of each treatment as a choice for that patient.
  7. You may write notes about a particular treatment option or the coding in general.
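
The bookkeeping in steps 3–5 can be summarized schematically. The Python sketch below mirrors the worksheet logic with invented example adjustments; the actual coding judgment is, of course, a human one and not an algorithm.

```python
# Minimal schematic of the worksheet bookkeeping (illustrative only; the
# real scores are human judgments, not sums of deltas).

ND = "ND"
TREATMENTS = ("active_surveillance", "surgery", "radiation")

def final_scores(adjustments):
    """adjustments: (treatment, delta) pairs noted while reading a
    transcript. Every treatment starts at 0; scores stay within -2..+2;
    treatments never mentioned end as ND."""
    scores = {t: 0 for t in TREATMENTS}
    mentioned = set()
    for treatment, delta in adjustments:
        mentioned.add(treatment)
        scores[treatment] = max(-2, min(2, scores[treatment] + delta))
    return {t: (scores[t] if t in mentioned else ND) for t in TREATMENTS}

print(final_scores([("surgery", 1), ("surgery", 1),
                    ("active_surveillance", -1)]))
# {'active_surveillance': -1, 'surgery': 2, 'radiation': 'ND'}
```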

Coding Worksheet

Treatment Option Recommendation Coding Subject ID: _________ Coder:_________________

Treatment               Recommendation Score          Notes
Active Surveillance     −2  −1  0  +1  +2  ND
Surgery                 −2  −1  0  +1  +2  ND
Radiation               −2  −1  0  +1  +2  ND
Brachytherapy           −2  −1  0  +1  +2  ND

ND = not discussed

Footnotes

*Type of decision aid did not affect the following variables: Patient treatment choice (χ2(2) = 2.93, p = .23), physician recommendation scores (e.g., for active surveillance F(1,252) = .16, p = .69), and patients’ perceived recommendations (χ2(3) = 2.29, p = .51).

†We coded brachytherapy recommendations separately from external beam radiation. However, brachytherapy was discussed in only 35% (90/257) of appointments and received the highest recommendation score in only 5 appointments. “Radiation” recommendation scores thus reflect external beam radiation recommendation scores.

‡We also calculated Krippendorff’s alpha treating the scores as ordinal variables, and the results were substantively similar.

§Because only 5 patients received brachytherapy, we collapsed external beam radiation and brachytherapy into a single “radiation” category. Results remain substantively similar if we conduct analyses of brachytherapy and external beam therapy as separate categories.

**We conducted a number of sensitivity analyses to examine the relationship between recommendation scores and treatment received, including using raw recommendation scores and transforming recommendation scores into a single categorical variable. Results were substantively similar.

††Only 5 patients perceived that their physicians recommended brachytherapy; therefore, we collapsed external beam radiation and brachytherapy into a single “radiation” category. Results remain substantively similar if we treat brachytherapy and external beam therapy as separate categories.

Conflict of Interest Disclosures

Peter A. Ubel is a consultant for Humana. The principal investigator and all other co-authors have no conflicts of interest.

Contributor Information

Karen A. Scherr, Fuqua School of Business and School of Medicine, Duke University.

Angela Fagerlin, Departments of Internal Medicine and Psychology, Center for Bioethics and Social Sciences in Medicine, University of Michigan, Ann Arbor; Ann Arbor VA Center for Clinical Management Research, Ann Arbor, Michigan.

Lillie D. Williamson, Fuqua School of Business, Duke University.

J. Kelly Davis, Fuqua School of Business, Duke University.

Ilona Fridman, Columbia Business School, Columbia University.

Natalie Atyeo, Duke University.

Peter A. Ubel, Fuqua School of Business, School of Medicine and Sanford School of Public Policy, Duke University.

References

1. Ranerup A. Transforming patients to consumers: evaluating national healthcare portals. Int J Public Sect Manag. 2010;23(4):331–9.
2. Thompson I, Thrasher JB, Aus G, et al. Guideline for the management of clinically localized prostate cancer: 2007 update. J Urol. 2007;177(6):2106–31. [PubMed]
3. Barry MJ, Edgman-Levitan S. Shared decision making – The pinnacle of patient-centered care. N Engl J Med. 2012;366(9):780–1. [PubMed]
4. Davison B, Breckon E. Factors influencing treatment decision making and information preferences of prostate cancer patients on active surveillance. Patient Educ Couns. 2012;87(3):369–74. [PubMed]
5. Gwede CK, Pow-Sang J, Seigne J, et al. Treatment decision-making strategies and influences in patients with localized prostate carcinoma. Cancer. 2005;104(7):1381–90. [PubMed]
6. Hall J, Boyd J, Lippert M, Theodorescu D. Why patients choose prostatectomy or brachytherapy for localized prostate cancer: results of a descriptive survey. Urology. 2003;61(2):402–7. [PubMed]
7. Tourangeau R, Rips LJ, Rasinski K. The psychology of survey response. Cambridge: Cambridge University Press; 2000.
8. Loftus E, Smith K, Klinger M, Fiedler J. Memory and mis-memory for health events. In: Tanur J, editor. Questions about questions: Inquiries into the cognitive bases of surveys. New York: Russell Sage Foundation; 1992. pp. 102–37.
9. Kunda Z. The case for motivated reasoning. Psychol Bull. 1990;108(3):480–98. [PubMed]
10. Veterans Affairs Office of Research and Development. ClinicalTrials.gov [Internet] Bethesda (MD): National Library of Medicine (US); 2000. [cited 2016 Jun 16]. Available from: http://clinicaltrials.gov/show/NCT00432601.
11. Patton MQ. Qualitative evaluation and research methods. 2nd ed. Newbury Park, CA: Sage Publications; 1990.
12. Hallgren K. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol. 2012;8(1):23–34. [PMC free article] [PubMed]
13. Hayes AF, Krippendorff K. Answering the call for a standard reliability measure for coding data. Commun Methods Meas. 2007;1(1):77–89.
14. Fisher B, Anderson S, Bryant J, et al. Twenty-year follow-up of a randomized trial comparing total mastectomy, lumpectomy, and lumpectomy plus irradiation for the treatment of invasive breast cancer. N Engl J Med. 2002;347(16):1233–41. [PubMed]
15. Manary M, Boulding W, Staelin R, Glickman S. The patient experience and health outcomes. N Engl J Med. 2013;368(3):201–3. [PubMed]
16. Boulding W, Glickman SW, Manary MP, Schulman KA, Staelin R. Relationship between patient satisfaction with inpatient care and hospital readmission within 30 days. Am J Manag Care. 2011;17(1):41–8. [PubMed]
17. Glickman SW, Boulding W, Manary M, et al. Patient satisfaction and its relationship with clinical quality and inpatient mortality in acute myocardial infarction. Circ Cardiovasc Qual Outcomes. 2010;3(2):188–95. [PubMed]
18. Botti S, Iyengar SS. The psychological pleasure and pain of choosing: when people prefer choosing at the cost of subsequent outcome satisfaction. J Pers Soc Psychol. 2004;87(3):312–26. [PubMed]
19. Fitzsimons GJ, Lehmann DR. Reactance to Recommendations: When Unsolicited Advice Yields Contrary Responses. Mark Sci. 2004;23(1):82–94.
20. Goldsmith DJ, Fitch K. The normative context of advice as social support. Hum Commun Res. 1997;23(4):454–76.
21. Botti S, McGill A. When choosing is not deciding: The effect of perceived responsibility on satisfaction. J Consum Res. 2006;33(2):211–9.
22. Dalal RS, Bonaccio S. What types of advice do decision-makers prefer? Organ Behav Hum Decis Process. 2010;112(1):11–23.
23. Oliver A. The Veterans Health Administration: an American success story? Milbank Q. 2007;85(1):5–35. [PubMed]
24. Roter DL, Hall JA. Physician gender and patient-centered communication: a critical review of empirical research. Annu Rev Public Health. 2004;25:497–519. [PubMed]