To assess the content of counseling about prostate-specific antigen (PSA) screening. Guidelines recommend informed consent before screening because of concerns about benefits versus risks. As part of the professional practice standard for informed consent, clinicians should include content customarily provided by experts.
Forty transcripts of conversations between medicine residents and standardized patients were abstracted using an instrument derived from an expert Delphi panel that ranked 10 “facts that experts believe men ought to know.”
Transcripts contained definite criteria for an average of 1.7 facts, and either definite or partial criteria for 5.1 facts. Second- and third-year residents presented more facts than interns (p=0.01). The most common facts were “false positive PSA tests can occur” and “use of the PSA test as a screening test is controversial.” There was an r=0.88 correlation between inclusion by residents and the experts’ ranking.
Counseling varied but most transcripts included some expert-recommended facts. The absence of other facts could be a quality deficit or an effort to prioritize messages and lessen cognitive demands on the patient.
Clinicians should adapt counseling for each patient, but our abstraction approach may help to assess the quality of informed consent over larger populations.
Communication is said to be the “main ingredient” of medical care; effective communication is associated with improvements in patients’ and clinicians’ mutual understanding as well as in relationships, satisfaction, patient adherence, and medical outcomes [1–7]. Problems with communication have been found in a wide variety of health care settings [5–13], but most efforts to improve communication have been limited to educational programs [13–16]. Since most methods developed for training programs require more resources than would be feasible across the entire population of health care providers and patients, we have been adapting methods from Quality Improvement for use with communication [17–26]. Quality Improvement has a successful track record for affordably improving physician behavior across entire states [11, 12], or what some call a “population scale.” To improve access to communication services across entire populations, new communication assessment methods will need to function on a lean budget and be simple enough for use by personnel without much experience in communication skills training.
This analysis of medicine residents’ counseling about prostate cancer screening is part of a larger effort to develop communication assessment tools that are suitable for use on a population scale [17–26]. We use a quality indicator strategy to operationalize individual communication behaviors. Quality indicators are quantitatively reliable variables, each of which represents a small part of total quality that can be tracked and reassessed after an intervention [28, 29]. We have previously reported communication quality indicators operationalizing how clinicians conduct their communication, ranging from jargon usage and assessment of understanding to discussion about potential emotions [19–25]. In this paper we introduce a communication quality indicator group that operationalizes whether the content messages included in counseling are consistent with recommendations from experts. Making comparisons between clinicians’ counseling and experts’ recommendations is consistent with a “professional practice” legal standard for informed consent, which holds that patients should be given the same information customarily disclosed by expert physicians for similar patients’ best interests.
We chose to study communication about prostate cancer screening because professional organizations recommend routine counseling about its potential risks [31–34]. Counseling is recommended for screening because there is no high-level evidence to suggest that screening with the prostate-specific antigen (PSA) test decreases morbidity or mortality [34–39]. Informing men does not always influence decision-making [40–45], but may have legal or intrinsic ethical value and is also important for helping men to understand the danger of prostate cancer and to be mentally prepared for an abnormal screening result [32, 33]. As with our other quality indicators [17–25], we chose to study counseling in resident physicians for feasibility reasons and to obtain a comparison sample for later studies of clinicians’ counseling after formal education is complete.
In addition to investigating communication by resident physicians, this study has two other purposes. First, the paper presents a methodological advancement, in that we derived the indicator criteria from the results of a previously published Delphi panel by Chan and Sulmasy. Our previous studies of the content of counseling after newborn screening used individual consultations with experts [17, 18]. Delphi-derived data have the advantages of being anonymously collected, less prone to bias, and ranked by priority.
Second, we designed this study to pilot the use of the structured implicit review method for communication transcripts, instead of the explicit criteria abstraction approach we have used for other communication quality indicators [19–24]. Both approaches are adapted from quality improvement techniques for review of medical records. In explicit criteria abstraction, chart reviewers search for objective features of medical care, following a data dictionary containing explicitly detailed definitions and examples. In structured implicit review, clinically knowledgeable abstractors are asked to make judgments about specific aspects of quality, using survey-like questions as a guide. Structured implicit review enables abstractors to make inferences about causation and clinicians’ motivations, and to identify nuances that might be missed by the more precise and quantitatively reliable explicit criteria method. Explicit criteria abstraction generally has better quantitative reliability [19–24] than structured implicit review, but for this project we chose structured implicit abstraction because its greater flexibility allowed us to incorporate the entire Chan and Sulmasy list without having to infer beyond the Delphi panel’s wording. Use of structured implicit methods also allowed us to investigate whether structured implicit review of communication performs similarly to its use in traditional Quality Improvement.
For this study we abstracted transcripts of conversations between internal medicine residents and standardized patients portraying a man with a question about prostate cancer screening. Transcripts were made from tapes collected during four workshops in a Primary Care Internal Medicine residency program. The workshops were part of the educational curriculum, but residents were asked to give informed consent and were offered a chance to decline use of their tapes for research. Methods were approved by institutional review boards at Yale and the Medical College of Wisconsin.
Before the didactic portion of each workshop, residents were taped in a standardized patient encounter, in which a 50-year-old man asked about prostate cancer screening. Encounters were done in the residents’ actual continuity clinic environment, and were not observed “live” by peers or attending physicians. A handout stated that the patient had no family history of cancer and had had an unremarkable physical exam the week before, so the resident would not feel obliged to do a physical or take an extended history. The handout did not contain any suggestions about how to discuss the screening test, and none of the residents had previously been taught about or trained in the facts recommended by the Chan and Sulmasy Delphi group.
Following the techniques of our Brief Standardized Communication Assessment (BSCA) tool for focusing data collection, patients began with a short speech patterned after the following example:
I’m sorry I’m back so soon after my physical, but I had to leave so quickly that I didn’t get a chance to ask a question. I recently saw an advertisement about prostate cancer screening, but I wasn’t sure if it was for me. What do you think?
To standardize the counseling task, patients were coached to avoid asking leading questions and to minimize the appearance of anxiety or confusion. All standardized patients were men chosen to plausibly depict the 50-year-old patient in the script. Also in the interest of standardization, all patients were Caucasian.
The tapes were transcribed verbatim and proofread for accuracy by a board-certified internist (MF or JS). To lessen abstractor bias, all names and other personally identifying information were removed from the transcription during the proofreading process. There was a final sample of 40 transcripts for this analysis.
Our abstraction methods used a content message identification instrument derived from a study by Chan and Sulmasy. As part of that study, a Delphi panel of national experts on prostate cancer (6 urologists and 6 non-urologists) was convened to identify and vote on priorities for a list of ten “key facts experts believe men ought to know” about PSA screening before giving consent. As a result of the Delphi method, the list is ranked so that the top facts received more of the experts’ high-priority ratings than the lower-ranked facts (Table 1). Since 6 of the facts contained two related but distinct concepts and another fact contained 3 concepts, for abstraction purposes we parsed each fact into individual messages, for a final total of 18. For a separate analysis we added another 16 messages derived from some additional lists described by Chan and Sulmasy. The final abstraction instrument therefore consisted of 34 separate messages that could be consolidated back into the original expert-recommended facts. The abstraction instrument was designed to facilitate the structured implicit review method as described in the Introduction.
To focus the abstraction procedure, abstractors were instructed to read and abstract the transcript one sentence at a time. Individual instances of each message were marked in the transcript, but in contrast to our previous analyses of individual statements [17–24], the final unit for this analysis was the entire transcript, i.e. whether each of the 34 messages was present somewhere in the transcript.
It is important to recognize that transcript abstraction is a much more targeted technique than the open-ended coding method referred to as “qualitative” analysis. As with their counterparts in Quality Improvement, communication abstractors read quickly through the transcript looking only for very specific content or conduct communication behaviors. In comparison, qualitative methods take much longer, are less reliable, and require more expertise than would be feasible for population-scale use, even though they might provide richer descriptions of the interaction between clinician and patient.
To allow for partially ambiguous statements and to help calculate inter-abstractor reliability, the instrument allowed the abstractors to assign the message variables with either “definite” or “partial” designations. After abstraction was complete, the definite and partial codings were consolidated back into the 10 facts that had been recommended by the Delphi panel of experts. For our analysis to accept a fact as “definite,” 2 abstractors had to have designated all component messages as definite, or 1 abstractor as definite and the other as partial. Each transcript was reviewed by at least 2 abstractors; 23% of transcripts were reviewed by 3 abstractors. Abstraction data from every third transcript were discussed by the abstractors for quality control purposes, following the suggestion by Feinstein.
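The consolidation rule can be sketched as code. This is a hypothetical illustration rather than the study’s actual software; the function name and data representation are our own. A fact is accepted as “definite” only when, for every component message, one abstractor coded it definite and the other coded it definite or partial; any other non-absent coding yields a “partial” designation.

```python
def fact_designation(codes_a, codes_b):
    """Consolidate two abstractors' message-level codes for one fact.

    codes_a, codes_b: parallel lists of 'definite'/'partial'/'absent',
    one entry per component message of the fact.
    """
    def message_definite(a, b):
        # At least one abstractor coded definite; the other, definite or partial
        return "definite" in (a, b) and "absent" not in (a, b)

    pairs = list(zip(codes_a, codes_b))
    if all(message_definite(a, b) for a, b in pairs):
        return "definite"
    if any(a != "absent" or b != "absent" for a, b in pairs):
        return "partial"
    return "absent"
```

For example, a fact whose two component messages were coded (definite, definite) and (partial, definite) by the two abstractors would be accepted as definite, while a (definite, partial) plus (partial, partial) pattern would only reach partial.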
Data were analyzed using 1-way ANOVA for continuous responses to categorical variables and the Chi-squared test for grouped categorical responses. Analyses were done using JMP software (SAS Institute, Cary, NC, USA).
Inter-abstractor reliability was calculated from the individual 34 concept variables using a weighted adaptation of Cohen’s method for the definite/partial/absent coding schema. For perfect agreement in this method, both abstractors had to code the transcript in exactly the same way (definite, partial, or absent). If abstractors split on definite versus partial ratings, half of an agreement was included in the calculation, although a full potential agreement was still used in the denominator for the Cohen correction for chance.
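This weighted adaptation can be sketched as follows, assuming (per the description above) that exact matches count as full agreements, definite/partial splits count as half agreements, and chance-expected agreement is computed from each abstractor’s marginal frequencies using the same weights. This is an illustrative reconstruction, not the study’s code.

```python
from collections import Counter

CATEGORIES = ("definite", "partial", "absent")

def credit(a, b):
    # Full credit for exact matches, half credit for definite/partial splits
    if a == b:
        return 1.0
    if {a, b} == {"definite", "partial"}:
        return 0.5
    return 0.0

def weighted_kappa(codes_a, codes_b):
    """Weighted Cohen's kappa for two abstractors' parallel code lists."""
    n = len(codes_a)
    observed = sum(credit(a, b) for a, b in zip(codes_a, codes_b)) / n
    # Chance-expected weighted agreement from each abstractor's marginals
    fa, fb = Counter(codes_a), Counter(codes_b)
    expected = sum(
        (fa[x] / n) * (fb[y] / n) * credit(x, y)
        for x in CATEGORIES
        for y in CATEGORIES
    )
    return (observed - expected) / (1.0 - expected)
```

Because a definite/partial split contributes 0.5 to the observed agreement but a full 1.0 to the possible agreement, kappa is penalized for near-misses without treating them as outright disagreements.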
Descriptive data on the participants (Table 2) were similar to those of the population of the residency program at the time of the study. The interviews lasted an average of 10.4 minutes (SD 4.5, skew 0.65).
For abstractors’ coding of the 34 content messages over the whole project, there were 1654 out of a possible 1972 agreements (κ=0.64). The median and maximum κ coefficients for the 34 individual messages were 0.51 and 0.82 respectively, consistent with our expectations for the structured implicit abstraction method. To assess feasibility of our methods we tracked expenses. The entire project was done for less than $50 per transcript, most of which was for transcription. Use of the abstraction instrument took less than 5 minutes of abstractor time per transcript.
Using a strict approach of only counting facts that met “definite” criteria for the abstractors, the average total number of facts included was 1.7 per transcript (SD 1.2). As shown in the histogram in Figure 1, most transcripts included 3 or fewer facts. Eight transcripts failed to meet definite criteria for any of the recommended facts. No transcripts contained more than 5 facts.
Transcripts contained an average of 3.4 content messages that met partial criteria because of partial overlap with the Delphi panel’s recommendations. When these partial-criteria facts were included in the calculation, the average number of facts identified increased to 5.1 facts per transcript (SD 2.1). One transcript was even found to include either definite or partial criteria for all 10 facts. A histogram of the number of definite plus partial facts per transcript has a roughly normal distribution (Figure 2).
Table 3 shows the number and percentage of transcripts including each fact when the analysis included either definite criteria or definite plus partial criteria. There was a positive correlation between the number of transcripts with definite-criteria facts and the facts’ ranking by the experts (r=0.88, p<0.001). The most common fact included was the fourth-ranked (“False positive PSA tests can occur”) seen with definite criteria in 58% of transcripts and with partial criteria in another 35% of transcripts. The experts’ top-ranked fact (“The use of the PSA test as a screening test is controversial”) was the second most common in the transcripts, with definite criteria in 33% of transcripts and partial criteria in another 50%. Definite criteria for the experts’ second-ranked fact were seen in only 3 transcripts (8%).
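The correlation reported here is a standard Pearson calculation between each fact’s frequency in the transcripts and its expert priority. The sketch below uses made-up counts and priority scores (higher score = more strongly recommended) because the Table 3 data are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: priority scores for the 10 ranked facts and counts
# of transcripts meeting definite criteria for each fact.
priority = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
counts = [13, 3, 10, 23, 7, 4, 2, 1, 1, 0]
r = pearson_r(priority, counts)  # positive for these illustrative data
```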
When both definite and partial criteria facts were considered, we found that second- and third-year residents tended to include more facts in counseling (averages 5.3 and 6.3) than first-year residents (average 3.2, p=0.005). No significant difference was apparent between male and female residents (average 5.4 versus 5.0).
When only definite-criteria facts were included, there was a mildly positive correlation between duration of the transcript and the number of facts (r=0.32, p=0.04). The correlation increased to 0.54 (p=0.003) with the addition of the partial-criteria facts. This moderate correlation suggested to us that longer conversations might devote much of their additional duration to longer discussions of the same content messages, or else to conduct-related behaviors such as assessment of understanding.
Effective communication before PSA screening is consistent with guidelines and will help men to be mentally prepared for an abnormal result [31–34]. In this paper we introduce a communication quality indicator for assessing the content of counseling likely to be provided. Previously we have developed communication quality indicators pertaining to several important communication behaviors, including other types of content messages [17, 18], jargon usage and explanation [21, 22], assessment of understanding [23, 24], speech complexity, and discussion about potential emotions [19, 20]. This paper demonstrates two new methodological options for communication quality indicators: derivation of abstraction criteria from a Delphi panel of experts rather than from consultants’ expert opinions, and use of a structured implicit review technique for those analyses where abstractor flexibility suits the analysis aims.
In our demonstration sample of residents we found that the content of counseling varied widely but often lacked facts that were highly recommended by the expert Delphi panel. The addition of partial criteria to the analysis suggested that many residents may touch briefly on some of the expert-recommended facts, but without enough specificity to convey the entire concept to our abstractors. These findings could amount to a problem with quality of communication, in which case feedback about being more specific may help clinicians like our residents to be more effective communicators. Alternately, the residents could have been intentionally holding to the time limits that they normally face in clinic. The residents may also have prioritized their content messages because of an understanding that only a limited number of messages can be successfully learned in one visit. Prioritization would be consistent with “first visit bias,” a possible limitation of standardized patient assessment methods.
Generalizability from this demonstration analysis is limited by the use of a small sample of residents from a single program, so further research at other programs and with physicians working outside of academic settings should be conducted before more specific conclusions can be made. Some questions about communication quality may be answered by the implementation of communication assessment methods over an entire population of clinicians and patients, such as our ongoing statewide analysis of processes and outcomes of communication after routine newborn screening. To make communication assessment possible on this population scale, we designed the method to be affordable enough to work on a lean budget and simple enough to be implemented by Quality Improvement personnel without communication research training. To increase acceptability to clinicians, data collection is brief enough to reduce annoyance for busy clinicians, standardized enough to make comparisons fair, and transparent enough for clinicians to understand how we arrived at their assessment result. Feasibility for standardized patients on a population scale is advanced by use over the telephone. Detailed descriptions of Communication Quality Assurance methods will be addressed in a forthcoming series of papers.
Generalizability from our demonstration analysis may also be reduced by the use of standardized patient encounters instead of recording encounters with actual patients. We chose to use simulation methods because they avoid logistical, privacy, and consent issues that would pose challenges for population-scale use. A limitation of standardized patient methods is that they may prompt clinicians to be on their best behavior due to a Hawthorne-like sense of observation. Future communication quality indicator analyses may be able to use recordings of encounters with actual patients, or with unannounced standardized patients.
A significant advantage of standardized patients, and our BSCA approach in particular, is that variation can be minimized so as to enable fairer comparisons and ranking of clinicians. The Hawthorne effect may reduce cross-clinician variation due to clinician effort, and the instructions provided to BSCA patients may reduce variation due to patient differences or the attitudes of traditional standardized patients. The resulting uniformity across clinicians allows an equal-footing comparison of communication competence that would be impossible in analyses of actual encounters or unannounced simulated patients. We believe that assessment of competence, which is a necessary prerequisite for performance, will provide a cost-effective strategy for improvement until research from us and others suggests that a global improvement in communication competence is being achieved. In colloquial terms, we see communication competence as the current version of the “low hanging fruit” that is often sought in traditional Quality Improvement. In the meantime, our newborn screening research will compare competence data with patients’ communication outcomes. Further research may also help to determine whether such strict data collection and abstraction procedures will be valuable as a supplement to the plausibility and flexibility needed for medical education.
Our experience in this project with structured implicit review methods for communication transcripts was about what we had expected. We chose structured implicit methods for our Delphi-derived quality indicator because they allowed us to be faithful to the experts’ original wording, and to evaluate the effect on reliability of reducing the explicitness of the abstraction criteria used in our previous projects [17–24]. As expected, we found that reliability was lower than we have experienced with our explicit criteria abstraction indicators. On the other hand, it was consistent with the reliability often seen in quality improvement projects that incorporate structured implicit review of medical records [25, 47, 54]. The ideal quality indicator uses explicit criteria abstraction, but in both traditional Quality Improvement and Communication Quality Assurance, the use of structured implicit review may be an acceptable tradeoff when flexibility is needed for a given research topic [47, 54].
Further research will be needed to develop more concrete guidelines for communication content, and guidance for how clinicians may plan out their counseling about specific issues. There is no consensus on how many messages need to be delivered for a patient to be “fully” informed about PSA screening, but conversations including few or none of the Chan and Sulmasy facts may be inconsistent with the professional practice standard for informed consent prior to PSA screening. For individual patients, the question of how many concepts to present depends on factors such as the patient’s levels of interest, attention, and health literacy. Regardless of the number of content messages included in counseling, clinicians are advised to use assessment of understanding questions to determine whether the messages were presented effectively [23, 24]. In addition, such methods will be accompanied by our other communication quality indicators, so that those methods’ greater reliability will enhance the fairness of comparisons across clinicians.
The data from this project are preliminary, but they suggest that problems may exist with the content domain of communication quality prior to PSA screening. The new method for comparing an aspect of counseling against a Delphi-derived practice standard appears to be affordable, feasible for use on a population scale, and comparably reliable to similar analyses in traditional Quality Improvement. Further research and development needs to be done, but these methodological innovations hold promise for the nascent field of population-scale Communication Quality Assurance.
We believe that if informed consent is worth doing, it is worth doing well. We hope that our observations of the content messages included in pre-PSA counseling will expand awareness among clinicians and medical educators about the difficulties of practical counseling in office settings. Even more so, we hope that implementation of population-scale assessment methods will enable improvement of communication on the same scale as health care itself. Improvements in the content of counseling may improve the experience of health care and promote the type of informed decision-making that guidelines recommend.
The authors are grateful to Dr. Stephen Huot and to the faculty and residents of the Yale University Primary Care Internal Medicine Residency Program. Dr. Farrell is supported in part by grants K01HL072530 and R01HL086691 from the National Heart, Lung, and Blood Institute. The authors do not have any actual or potential conflicts of interest to declare; there are no personal or other relationships with other people or organizations that could inappropriately influence, or be perceived to influence, this research.