The Consultation Quality Index (CQI) is a holistic quality marker for GPs based on patient enablement, continuity of care and consultation length.
To evaluate the CQI-2, a new version of the CQI incorporating a process measure of GP empathy (the Consultation and Relational Empathy Measure).
Cross-sectional questionnaire study.
General practice in the west of Scotland.
Empathy, enablement, continuity, and consultation length were measured in 3044 consultations involving 26 GPs in 26 different practices in the west of Scotland. CQI-2 scores were calculated and correlated with additional data on GPs' and patients' attitudes. Comparisons were also made with the UK–wide data from which the original CQI had been calculated.
CQI-2 scores were independent of deprivation, access, demographics, and case-mix. GPs with lower CQI-2 scores valued empathy and longer consultations less than GPs with higher CQI-2 scores. ‘Below average CQI-2’ GPs (those in the bottom 25%) also felt less valued by patients and colleagues. Patients showed less confidence in and gained less satisfaction from these doctors. Data ranges from the study were comparable with the UK data ranges used to construct the original CQI.
The CQI-2 is a new measure of holistic interpersonal care. In a small but representative sample of GPs it appears to differentiate between below and above average doctors. CQI-2 scores may reflect important aspects of morale, core values and patient-centred care. There may be potential for its use as part of professional development and as a component of the general medical services contract.
Quality of care in clinical medicine can be conceptualised as the integration of access to care and effectiveness of care.1 In turn, effectiveness of care is regarded as the interaction between technical effectiveness and interpersonal effectiveness. Interpersonal effectiveness is hard to define in a way that lends itself to measurement, yet it is widely regarded as being one of the core defining attributes of the good GP.2,3 It allows diagnoses to be made in holistic (bio–psycho–social) terms,3,4 and is achieved through what is now usually defined as patient-centred practice5 — the amalgamation of appropriate consulting skills and styles, the identification of patients' priorities and concerns, and the involvement of patients (as much as they wish to be involved) in decision making about the management of their own health problems. The extensive literature on these issues has recently been reviewed.3
The Consultation Quality Index (CQI)6 was developed from a programme of research into the distribution and determinants of quality of care at routine general practice consultations. The CQI is based on three measures: patient enablement, continuity of care, and consultation length.
The CQI has been used to describe quality of care being provided by both practices and individual doctors, and has proved able to identify strengths and weaknesses in both.6 A suggestion has also previously been made for a way in which it could be used as an incentive/reward payment.
Donabedian has suggested that an ideal quality measure should include elements of process, structure and outcome relevant to the attribute being studied.9 The CQI already arguably includes a ‘process’ component (consultation length), a ‘structure’ component (continuity of care) and an ‘outcome’ component (enablement). However, a new process measure of the consultation called the CARE (Consultation and Relational Empathy) measure has recently been developed and validated in primary care.10–12 A major strength of this new measure is that it has been carefully developed and validated (both qualitatively and quantitatively) as a tool that is meaningful to patients across the socioeconomic spectrum.11,12 The theoretical considerations regarding empathy in the clinical context and the development of the CARE measure have been discussed previously.10
This paper reports the development of a new composite measure of effective interpersonal care based on integrating the CQI and CARE measures, in an attempt to measure meaningful aspects of consultation structure, process, and outcome in one ‘holistic tool’. For the purposes of this paper, the new measure is referred to as CQI-2.
The instruments used to construct the CQI-2 are shown in Supplementary Appendix 1. For the rest of this paper, ‘empathy’ is the score taken from the CARE measure (Supplementary Appendix 1, question 5) completed by the patient immediately after the consultation. ‘Enablement’ is the score taken from the PEI (Supplementary Appendix 1, question 1) completed by the patient also immediately after the consultation. ‘Continuity’ is derived from responses to the question ‘how well do you know the doctor you are going to see (or have seen) at the consultation’ (Supplementary Appendix 1, question 4). ‘Consultation length’ means the length of time from the doctor starting the consultation to the point where the doctor feels that the consultation has ended.
Interpersonal effectiveness in the consultation is an important, but hard to measure, aspect of quality of care. The Consultation Quality Index (CQI) was devised as a composite tool comprising patient enablement (a consultation outcome measure), continuity of care, and consultation length. The development and validation of a new process measure of physician empathy in the consultation (the CARE measure) has allowed us to combine all four measures into a new measure, the CQI-2. Preliminary data from 26 GPs in the west of Scotland suggests that the CQI-2 may be a useful tool in identifying doctors with below average, average, or above average interpersonal skills, may have utility in appraisal and revalidation, and could be used as the basis for an incentive payment in the GMS contract aimed at rewarding good interpersonal/holistic care.
Empathy is scored from 10–50 as previously described.12 Enablement is scored on a scale of 0–12 as previously described.6 Continuity is scored as the proportion of consultations rated as 4 or 5 in question 4 of the Supplementary Appendix 1. Consultation length is measured in real time; in this study it was collected by doctors recording the exact start and finish time of each face-to-face consultation using the time displayed on their computer screens.
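The continuity and consultation length components described above can be sketched in a few lines. This is a minimal illustration only: the function names, the handling of missing answers, the clock-time format, and all data values are assumptions made for the example and are not taken from the study.

```python
from datetime import datetime
from statistics import mean


def continuity_score(q4_ratings):
    """Proportion of consultations rated 4 or 5 on question 4.

    Missing answers (None) are dropped before the proportion is taken;
    that handling is an assumption, not something the paper specifies.
    """
    rated = [r for r in q4_ratings if r is not None]
    return sum(1 for r in rated if r >= 4) / len(rated)


def consultation_minutes(start, finish, fmt="%H:%M"):
    """Minutes between the recorded start and finish clock times."""
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Hypothetical records for one doctor (illustrative values, not study data).
ratings = [5, 4, 2, 3, 5, 1, 4, 4]
times = [("09:00", "09:08"), ("09:10", "09:21"), ("09:25", "09:30")]

continuity = continuity_score(ratings)
mean_length = mean(consultation_minutes(s, f) for s, f in times)

print(f"continuity = {continuity:.3f}, mean length = {mean_length:.1f} min")
```

Each doctor's mean empathy and enablement scores would be obtained in the same way, by averaging the per-patient CARE and PEI totals.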
The first of the two main data sets used included 3044 consultations carried out by 26 GPs in 26 medium-sized (three to four partners) non-training practices in the west of Scotland. The practices were drawn to include doctors from both deprived and non-deprived areas across four health board regions. Details of the sampling frame and representativeness of the study have been presented elsewhere.11 Practice characteristics (number of partners, list-size, practice deprivation score), and GP workload (hours per week spent consulting, number of consultations per week, number of daytime house visits, number of daytime telephone consultations) were documented. Patient information was collected on a target of a minimum of 100 consecutive, unselected consultations from each doctor. Empathy, enablement, continuity, and consultation length were recorded as referred to above. Patients were also asked whether they would recommend the doctor they had seen to family and friends, and about their overall satisfaction with their consultation (from the General Practice Assessment Questionnaire [GPAQ]13). Before their consultation, patients were asked about access issues (booking time and waiting time13) and about their confidence in the doctor they were about to see.14 Other patient details were also collected including age; sex; ethnicity; marital status; employment status; home ownership; age at leaving full-time education; postcode deprivation score; reason for consulting; consulting for a new or long-standing problem; number of problems to discuss; GHQ-caseness (cut-off score ≥5); long-standing illness or disability; comorbidity; general health over the past 12 months; and consultations with a GP over the last 12 months.
After the study but before results had been revealed, the participating doctors self-completed the same empathy measure (using it as an overall self-assessment), scoring how they thought they generally performed for each item, using the same scale as the patients (poor to excellent). They also completed an additional item asking them to rate the importance of empathy in everyday consultations (from ‘not important’ to ‘very important’ on a 4-point scale). They completed the Morale in General Practice Inventory (MAGPI) instrument15 designed to assess different aspects of GPs' morale, and were asked to document what their optimal average consultation length would be (if more time were available to spend with patients).
The second data set included information on at least 50 consecutive adult consultations carried out by 171 randomly selected doctors in four contrasting areas of the UK (west London, Oxfordshire, Coventry and Lothian).8 The data included enablement, continuity, and consultation length.
In both studies, as an indicator of ethnicity, patients were asked to record what languages they spoke at home in addition to English. The data used in this study excluded patients who spoke ‘other languages’ at home as previous work has indicated differences in the relationship between consultation length and patient enablement in ‘other language’ patients.8
The original CQI was based on the UK data for enablement, continuity, and consultation length. Doctors' mean consultation scores for each item were divided into sextiles.6 (The boundary values for each sextile have been published elsewhere6 and are reproduced in Table 4). Six points were awarded for a score in the top sextile, reducing to one point for a score in the bottom sextile. Scores were added to give the total CQI score, which can thus lie between 3 and 18.6
CQI-2 was calculated in the same way for the 26 doctors in the west of Scotland using sextiles based on the west of Scotland data set. The addition of the new fourth component (empathy) gives a possible score range for CQI-2 of 4–24.
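The sextile scoring described above can be sketched as follows. The boundary values in this example are invented placeholders (the real ones are those in Table 4), and the treatment of a mean that falls exactly on a boundary is an assumption, since the text does not say which sextile such a value belongs to.

```python
from bisect import bisect_right


def sextile_points(value, cut_points):
    """Return 1 to 6 points: 6 for the top sextile, 1 for the bottom.

    cut_points holds the five ascending boundary values that separate
    the six sextiles. A value equal to a boundary is placed in the
    higher sextile (an assumption for this sketch).
    """
    if len(cut_points) != 5 or list(cut_points) != sorted(cut_points):
        raise ValueError("expected five ascending boundary values")
    return 1 + bisect_right(cut_points, value)


def cqi2_score(doctor_means, boundaries):
    """Sum the sextile points over the four CQI-2 components."""
    return sum(sextile_points(doctor_means[name], cuts)
               for name, cuts in boundaries.items())


# Hypothetical boundaries, NOT the published Table 4 values.
boundaries = {
    "empathy": [36, 38, 40, 42, 44],
    "enablement": [2.5, 3.0, 3.5, 4.0, 4.5],
    "continuity": [0.3, 0.4, 0.5, 0.6, 0.7],
    "length_min": [6, 7, 8, 9, 10],
}

# Hypothetical mean scores for one doctor.
doctor = {"empathy": 41.2, "enablement": 3.1,
          "continuity": 0.72, "length_min": 5.5}

total = cqi2_score(doctor, boundaries)
print(total)  # 4 + 3 + 6 + 1 = 14
```

With four components each scored 1 to 6, the total necessarily lies between 4 and 24; dropping the empathy entry from both dictionaries reproduces the 3 to 18 range of the original CQI.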
The boundary values for the west of Scotland and UK sextiles were compared using Pearson's correlation coefficient (Table 4). Given the anxiety that doctors might record consultation length inaccurately if it were being used as a performance measure, a version of CQI-2 excluding the item on consultation length was calculated and compared with the complete 4-item CQI-2, again using Pearson's correlations.
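The comparison between the complete 4-item CQI-2 and the version without consultation length amounts to a Pearson correlation between two sets of per-doctor totals. The sketch below uses invented sextile points for five hypothetical doctors; the helper name `pearson_r` and the data are assumptions for illustration, not study values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical sextile points per doctor:
# (empathy, enablement, continuity, consultation length).
doctors = [(6, 6, 5, 4), (5, 4, 4, 6), (3, 3, 4, 2), (2, 1, 3, 3), (1, 2, 1, 1)]

full_scores = [sum(points) for points in doctors]          # 4-item CQI-2
reduced_scores = [sum(points[:3]) for points in doctors]   # length omitted

r = pearson_r(full_scores, reduced_scores)
print(f"r = {r:.2f}")
```

A high r here only indicates that the two versions rank doctors similarly; it does not by itself show that the omitted item adds nothing.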
Empathy, enablement, continuity, consultation length, and CQI-2 scores for the 26 doctors were normally distributed and were therefore correlated with each other using Pearson's correlations. CQI-2 and its four components were correlated with the other available doctor and patient data using Spearman's ρ, as many of these other variables were not normally distributed.
The 26 doctors were divided into three groups with cut-points based on the inter-quartile range. This resulted in an ‘above average CQI-2’ group of six doctors (729 patients), an ‘intermediate CQI-2’ group of 14 doctors (1557 patients) and a ‘below average CQI-2’ group of six doctors (758 patients). Mean values for variables with significant or near significant rank correlations with doctors' or patients' views were then cross-tabulated with these groups of high, medium and low CQI-2 scoring doctors. Statistical differences between bands were assessed by Kendall's τ. Because of the low numbers in each group, responses to the individual items of the MAGPI instrument were collapsed into two categories for each item (‘yes’ = MAGPI item score of 2 or 3, ‘no’ = MAGPI item score of 1).
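The banding of doctors by quartile cut-points could be sketched as below. The paper does not state which quartile convention was used or how scores falling exactly on a cut-point were assigned, so the `inclusive` method of `statistics.quantiles` and the strict inequalities are assumptions, as are the doctor labels and scores.

```python
from statistics import quantiles


def band_doctors(scores):
    """Assign each doctor to a CQI-2 band using the quartile cut-points.

    scores maps a doctor identifier to a CQI-2 total. Doctors strictly
    below the first quartile are 'below average', those strictly above
    the third quartile are 'above average', and the rest 'intermediate'.
    """
    q1, _, q3 = quantiles(list(scores.values()), n=4, method="inclusive")
    bands = {}
    for doctor, score in scores.items():
        if score < q1:
            bands[doctor] = "below average"
        elif score > q3:
            bands[doctor] = "above average"
        else:
            bands[doctor] = "intermediate"
    return bands


# Hypothetical CQI-2 totals for eight doctors (not study data).
scores = {"A": 5, "B": 8, "C": 10, "D": 12, "E": 14, "F": 16, "G": 19, "H": 23}

bands = band_doctors(scores)
for doctor, band in bands.items():
    print(doctor, scores[doctor], band)
```

Because CQI-2 totals are integers with frequent ties, different quartile conventions can move a doctor between bands, which is one reason to state the convention explicitly when reporting such groupings.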
The CQI-2 was calculated for the 26 west of Scotland doctors as described above. The number of patients per GP ranged from 56 to 131, exceeding the minimum of 50 patients per doctor that our previous work has shown to be necessary to calculate a stable estimate of mean GP scores.6,12
Scores for individual GPs ranged from 5 to 23, with a mean value of 13.7 (95% confidence interval [CI] = 11.8 to 15.7). There was no significant difference in mean CQI-2 score between the doctors working in the high deprivation areas (mean score 13.7) and those working in low deprivation areas (mean score 13.8). Female and male doctors scored similarly (14.5 versus 13.1, P = 0.46, independent t-test).
The relationships between the CQI-2 and its individual components are shown in Table 1. Each component correlated significantly with total CQI-2 score (r between 0.61 and 0.89). The correlation between empathy and enablement was highly significant (P<0.001), but the other inter-correlations between individual components were generally of lower significance, probably in part due to the small number of doctors involved.
To examine the ability of the CQI-2 to discriminate between doctors in terms of interpersonal effectiveness, we carried out correlations for CQI-2 scores of all 26 doctors against the other variables collected in the west of Scotland. Those correlations which were significant are shown in Table 2. There were no significant correlations between CQI-2 (or its components of empathy, enablement, continuity, and consultation length) and documented workload, or access (percentage of patients seen within 2 days, percentage of patients taken on time; results not shown). However, doctors' views on the importance of empathy, and their estimate of their own empathy, both correlated with patients' ratings of the doctors' empathy and patient enablement. Self-rated empathy also correlated significantly with CQI-2 score. Although doctors' overall morale score (as measured by the MAGPI) was not significantly associated with CQI-2 score (ρ = 0.018), several predicted ‘relational’ items of the MAGPI were: doctors who felt that their patients didn't value the job they did for them had lower CQI-2 scores; and doctors who found it hard to balance work with home life had higher CQI-2 scores. In addition, doctors' view of ideal consultation length (the average amount of time they would like to have with each patient) was also positively correlated with CQI-2 score. The CQI-2 score and its component parts also showed highly significant correlations with mean patient scores for confidence in the doctor, whether they would recommend the doctor to others, and their overall satisfaction.
The doctors were divided into three groups: above average CQI-2 scorers (top quartile of 6 doctors; CQI-2 scores = 17–23); intermediate CQI-2 scorers (14 doctors; scores = 10–16); and below average CQI-2 scorers (bottom quartile of 6 doctors; CQI-2 scores = 5–9). The aim of this grouping was to separate out the doctors at the two ends of the spectrum, that is, those with ‘below average’ and those with ‘above average’ scores. The mean CQI-2 scores for the below average, intermediate, and above average groups were 7.3 (95% CI = 5.7 to 8.9), 13.7 (95% CI = 12.6 to 14.9), and 20.3 (95% CI = 17.6 to 22.8), respectively.
Table 3 shows that there was again a significant difference between the three CQI-2 groups on three aspects of GP morale (from the MAGPI): ‘my patients think I do a good job for them’ and ‘my colleagues value me’ (which were positively related to CQI-2 score), and ‘home and work life in balance’, which was negatively related. (Note that two of the 14 doctors in the intermediate CQI-2 group did not return the GP questionnaire, and thus results are calculated as a percentage of the 12 GPs who did complete the questionnaire.) Doctors in the below average CQI-2 band saw less need for empathy or for longer consultations. Table 3 also shows that patients' confidence in their doctor, the likelihood that patients would recommend the doctor to friends and family, and their overall satisfaction with the consultation all showed a significant gradient across the three CQI-2 groups.
There were no significant differences between the three CQI-2 groups in practice characteristics (number of partners, list-size and practice deprivation score), documented workload or access (percentage of patients seen within 2 days and percentage of patients taken on time) (results not shown). Similarly, there were no differences between groups in patient characteristics (age, sex, ethnicity, marital status, deprivation indicators, or case-mix, reason for consulting, consulting for a new or long-standing problem, number of problems to discuss, GHQ-caseness, long-standing illness or disability, comorbidity, general health over the past 12 months, consultations with a GP over the last 12 months) (results not shown).
Table 4 compares the original UK and new west of Scotland CQI sextiles for enablement, continuity, and consultation length. It can be seen that the boundary values from the two data sets overlap completely for enablement and for consultation length and are not dissimilar for continuity (the proportions of patients who ‘know the doctor well’). (The differences in the latter categorisation possibly reflect the absence of large practices in the west of Scotland sample, but possibly also the fact that the UK continuity data was collected before consultations whereas the west of Scotland data was collected after consultations). The correlation between CQI-2 scores calculated using west of Scotland sextiles and UK sextiles was r = 0.96.
CQI-2 is an improved measure of quality of interpersonal care at routine general practice consultations. It may have utility in appraisal and revalidation and could potentially be introduced into the GMS contract as a reward for doctors who are providing holistic and high quality interpersonal care to their patients.
Donabedian has recommended that quality measures should contain components which cross-relate and draw from items representing structure, process and outcome.9 CQI-2 does this, and by including the CARE measure — which correlates with all three components of the original CQI — it materially strengthens it as a measure of the interpersonal effectiveness of doctors. Patients consistently rank empathy and humanness as a key attribute of a ‘good doctor’.10,16 Furthermore, the broad-based definition of empathy used in the development of the CARE measure incorporates (within the 10 items of the measure) the main competences expected of doctors who exhibit ‘patient-centred’ consultation skills.3,5 CQI-2 in the present study proved able to detect differences between doctors (both in terms of doctors' views and patients' views) even on a small sample size. Data from a larger sample is needed before concluding whether the single item CARE measure or the 4-item CQI-2 strikes the better balance between reliability and practicality as a quality measure for interpersonal effectiveness.
The original CQI appeared able to identify a small number of low-scoring doctors who proved to have either health problems or problems of low job satisfaction, both of which could be postulated as contributing to below-optimal consultations.6 In looking at these other variables, our hypothesis, based on previous work,17,18 was that doctors with higher CQI-2 scores (that is, those who give longer consultations, provide better continuity, are more empathic, and enable patients more) would value ‘therapeutic relationship’ more and that this would be apparent in associations between their self-ratings of empathy, the importance they attach to empathy and consultation length, and aspects of morale specifically relating to ‘therapeutic engagement’ with patients (MAGPI item 10) and perhaps also with colleagues (MAGPI item 11). Low scores on CQI-2 have indeed been associated with issues relating to low morale in terms of relationships with colleagues and patients alike. The fact that low CQI-2 scorers are less likely to report difficulties with work–life balance suggests that such doctors may have ‘disengaged’ from their work. Previous studies have shown that doctors who value (and provide) empathic consultations are more satisfied in their work;9 it is of interest in the present study that the low-CQI-2 group of GPs both valued empathy less and rated their empathy lower than the GPs in the higher CQI-2 groups. Clearly caution must be used in assigning causality between the associations we have found, especially those of weak statistical significance (P>0.01), and further larger studies are required to confirm our findings.
The 2004 UK GMS contract introduced financial incentives for ‘evidence-based practice’. These are awarded largely for technical rather than interpersonal effectiveness, possibly reflecting the ease of measurement of technical effectiveness compared with interpersonal effectiveness. A recent BMJ essay2 expressed what many feel — that the emphasis on ‘technical care’ incentives in the contract has been at the expense of the ability to deliver optimum ‘interpersonal care’.19 We have previously suggested a mechanism where an element of available incentive-payment monies could be redistributed from low CQI scorers to high CQI scorers6 and believe that CQI-2 could potentially be used in the same way in order to correct what many regard as a current imbalance of incentives between technical and interpersonal aspects of care.
For all current and potential measures of interpersonal effectiveness (including the instruments presently in use for carrying out the approved ‘patient surveys’), further work is required on patients and practitioners from different ethnic groups. In our original work with CQI we found that patients consulting with a doctor in a language other than English had generally shorter consultations, but scored enablement more highly.8 However, despite these differences, we were able to show a strong correlation between doctors' ranking for CQI when their ‘English speaking’ patients were compared with their ‘non-English speaking’ patients.
Given the level of overlap between the boundary values for the three original components of CQI-2 in the UK and west of Scotland studies, we recommend using those from the UK study as they are based on the larger cohort. Given the possibility that ‘consultation length’ might be liable to inexact recording (this is the one component recorded by the doctor, and is capable of being recorded incorrectly or even being ‘gamed’), a further version of the CQI-2 omitting the ‘mean consultation length’ item was compared to the full CQI-2 as calculated above; ‘r’ was found to be 0.95 for the UK ranges and 0.93 for the west of Scotland ranges. It would therefore be possible to omit this component altogether either to make the instrument easier to administer or where inaccurate recording is suspected, although at the present time we believe this would reduce the strength of the complete instrument.
We are also aware of one large unpublished study in the west of Scotland where the enablement scores recorded were significantly higher than those we found in this or any of our previous studies. We recommend that if mean values for any component of any CQI-2 study depart materially from those shown in Table 4, the sextiles being used should be re-banded appropriately.
We are grateful to Margaret Maxwell, David Heaney, Graham Watt, and Ann Louise Kinmonth for their advice and help, and to all the GPs, practices and patients who participated.
Additional information accompanies this article at http://www.rcgp.org.uk/Default.aspx?page=2482
Stewart W Mercer was supported by a Health Services Research Training Fellowship from the Chief Scientist Office of the Scottish Executive at the time of the original study (CZP/4/5), and is currently supported by a Primary Care Research Career Award from the same organisation.
The authors have stated that there are none.