The purpose of this study was to develop and validate a generic questionnaire to evaluate experiences and reported outcomes in patients who receive treatment across a range of healthcare sectors.
Mixed-methods design including focus groups, pretests and field test.
The patient questionnaire was developed in the context of a nationwide program in Germany aimed at quality improvements across the healthcare sectors.
For the field test, 589 questionnaires were distributed to patients via 47 general practices.
Descriptive item analyses, non-responder analysis and factor analysis (PCA). Retest coefficients (r) were calculated by correlating the sum scores of PCA factors. Quality gaps were assessed by the proportion of responders choosing a response category defined as indicating shortcomings in quality of care.
The conceptual phase showed good content validity. Four hundred and seventy-four patients who received a range of treatments across a range of sectors were included (response rate: 80.5%). Data analysis confirmed the construct, oriented to the patient care journey with a focus on transitions between healthcare sectors. Quality gaps were assessed for the topics ‘Indication’, including shared-decision-making (6 items, 24.5–62.9%) and ‘Discharge and Transition’ (10 items; 20.7–48.2%). Retest coefficients ranged from r = 0.671 to r = 0.855 and indicated good reliability. Low rates of item non-response (0.8–9.3%) confirmed high acceptance by patients.
The number of patients with complex healthcare needs is increasing. Initiatives to expand quality assurance across organizational borders and healthcare sectors are therefore urgently needed. A validated questionnaire (called PEACS 1.0) is available to measure patients' experiences across healthcare sectors with a focus on quality improvement.
Measurement and reporting of patients' experiences have become an important element of health-service evaluation worldwide. Several reports, in particular from the USA, have shown that fragmentation of modern healthcare systems has serious implications for patients and their quality of care, despite the stipulation that modern healthcare systems provide a sophisticated level of medical care, knowledge and technology. However, fragmentation between sectors also poses a challenge for patients and healthcare providers. This is most evident in the increasing complexity of managing patients living with chronic illness and cancer, where many actors are involved in the delivery of a complex chain of care across multiple sectors. In turn, this leads to particular problems at the transition points between care sectors. Patients report problems with respect to coordination and cooperation between healthcare providers, with the consequence that problems in patient safety and quality of care arise [4, 5]. The current challenge posed is to expand quality assurance to include a cross-sectoral focus. Patients' perspectives are critical to this expansion process, not only because patients are the only ones with first-hand experience of the different sectors from start to finish during their care journey, and therefore potentially offer a useful overview, but also because it is best practice to involve patients' perspectives in quality improvement initiatives [6, 7] as part of a move towards patient-centered care in modern health systems [2, 8].
In Germany, as in many other countries, quality assurance systems are typically restricted to either outpatient care or hospital care. However, a comprehensive program for quality improvement across healthcare sectors in Germany (‘Sektoruebergreifende Qualitaetssicherung im Gesundheitswesen’ or ‘SQG’), established in 2009, is focused on the complete patient care journey across sectoral borders: from the phase of diagnosis to the phase of discharge and transition, continuing with outpatient follow-up care. As the SQG program is based on quality indicator development and measurement, feedback from patients via surveys provides an important method to evaluate performance of state-funded healthcare providers. To date, some topics for indicator development have included cataract surgery, cervical conization, colorectal cancer and percutaneous coronary interventions or coronary angiography.
International approaches and best practices in designing national quality improvement programs adopted multiple methods and steps for the development of patient questionnaires. One of the pioneers was the patient survey developed by the Consumer Assessment of Healthcare Providers and Systems Consortium (CAHPS) in the USA, which was used within a national quality program measuring patients' experiences of hospital care [10, 11]. The Netherlands adopted the CAHPS questionnaire as a component of their Consumer Quality Index (CQ index), a measurement to compare consumer experiences in health care for national health planning. Numerous other countries use patient surveys for public reporting or health planning at a national level, e.g. the UK (NHS/Picker Institute Europe), Canada (Canadian Community Health Survey), Norway (Norwegian Knowledge Centre for the Health Services) or Denmark (Department of Quality Measurement for Aarhus). However, currently available instruments for gathering and representing patients' views are either focused on organizational service development or limited to sector-specific contexts, either outpatient or inpatient care. To the authors' knowledge, no established tool is available to evaluate patients' experiences along their complete journey across healthcare sectors, which led to the development of this new instrument: PEACS 1.0 (Patients Experiences Across Care Sectors). We report the process of development and validation of this instrument, in a German context, to measure the quality of care across sectors from the patients' perspective.
The instrument was developed between June 2011 and September 2012 in collaboration between the Department of General Practice and Health Services Research at University Hospital Heidelberg and the AQUA-Institute for Applied Quality Improvement and Research in Healthcare, Goettingen.
The mixed-method study design (Fig. 1) started with a conceptual phase including a qualitative focus group study to identify a broad range of patient perspectives, although in this paper, we primarily describe the quantitative field test. The goals of the field test were (a) to assess validity and reliability in accordance with the concept of measurement properties and the proposed quality criteria for health-related patient-reported outcomes compiled by the COSMIN initiative [14, 15] and (b) to determine whether the items were able to measure cross-sectoral quality. Before reporting the methods and results of the main field test, relevant background information related to the conceptual phase is described.
Qualitative data on patient perspectives were gathered via focus groups with a total of 28 patients. Following a literature review, a guide for the focus groups was created. The major subjects identified were communication, care-management, shared-decision-making, patient safety, patient support and frameworks. Focus group feedback was plotted along the cross-sectoral patient care journey to develop the questionnaire construct. Thus, the construct included a collection of major and minor subjects mapping a generic patient perspective onto the quality of cross-sectoral care. A core finding from the focus groups was that discharge and transition to home, or to follow-up care, were key areas where quality deficits occurred. Based on focus group data and the ensuing construct, an item pool of 145 questions was developed. The results of the focus-group study and the construct have been published separately.
Two central concepts for measuring patients' views are rating and reporting. There is a long tradition of using rating scales for global measurement of patient satisfaction [16–18]. Since the 1990s, evidence has grown that reporting of experiences can be more helpful for quality improvement than global ratings of satisfaction [19, 20]. In this study, reporting items, supplemented by rating items, and a 4-item scale for self-reported outcome formed the conceptual basis of the questionnaire. The reporting items were grouped into five composites and the rating items into three composites. The construct of the composites followed the process of cross-sectoral care, including the following process phases: ‘Diagnosis’, ‘Indication’, ‘Treatment at hospital/institution’, ‘Discharge and Transition’ and ‘Outpatient/Follow-up care’. The scales were not designed as homogeneous subscales. Residual categories with the meaning ‘Was not important to me’ or ‘Does not apply to me’ were added to the reporting items. For rating items, a five-point Likert scale was used, ranging from ‘fully correct’ to ‘not correct at all’, also with one residual category.
Further information on Step 3 (Cognitive pretest) and Step 4 (Pretest and item reduction) of conceptual phase is summarized in Supplementary data, Appendix S1. In the following section, the main field test will be described.
The pilot version of the questionnaire was administered to patients during a 2-week period in May 2012. Patients were recruited by 47 participating primary healthcare practices, which were selected by convenience sampling from an existing network of research practices. Practice staff documented patient information from the recruited participants in a pseudonymized form.
In accordance with the SQG program and the core goals driving the development of a generic cross-sectoral questionnaire, patients were eligible if they had (a) received treatment in one of the target areas of the SQG program topics or (b) undergone another surgery or treatment in the previous 12 months, in either an inpatient or an outpatient setting (Table 1). In addition, patients had to be over 18 years of age. Patients were re-contacted 3 weeks after the first questionnaire was returned and asked to complete the questionnaire a second time for the retest. No reminders or incentives were given to patients.
A non-responder analysis was conducted to prevent bias in interpreting the data [22, 23]. An independent-samples t-test was used to test for differences in age, and cross-tabulation with chi-square tests was used for sex and treatment group.
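As an illustration of the cross-tabulation step, the following is a minimal sketch of a Pearson chi-square test for a 2 x 2 table (for example sex by responder status). It is not the SPSS procedure used in the study: the counts are invented for demonstration, no continuity correction is applied, and the function name `chi_square_2x2` is hypothetical. For one degree of freedom the P value can be obtained from the complementary error function.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 count table.

    table[i][j] = count in row i (e.g. responder / non-responder)
                  and column j (e.g. female / male).
    Returns (chi-square statistic, two-sided P value for 1 degree of freedom).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For df = 1 the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts, NOT the study data.
example_table = [[30, 20], [25, 25]]
chi2, p_value = chi_square_2x2(example_table)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A table with more than two categories, such as the treatment groups, has more degrees of freedom and would need a general chi-square distribution function rather than this shortcut.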
To describe sample and measurement characteristics, counts and percentages were used (Table 2). The acceptance of the instrument was based on the item non-response rate, calculated as the percentage of responders who did not provide valid responses for each item. The analysis of the residual category enables estimation of whether an item is important for the target group. Ceiling effects were also checked by evaluating the proportion of patients using the highest response category. A threshold of 85% was applied, which is a common limit for ceiling effects, unless conceptual reasons were found. To measure quality, the particular response categories indicating low quality were defined for each item. We give a translated example for a reporting item with the indications of each response category:
Did the doctor talk to you about risks and possible complications of the treatment? (Item 14)
In the case of rating items with a five-point Likert scale, the answer ‘fully correct’ was used to assess the ceiling effect. The options ‘partly’, ‘not correct’ and ‘not correct at all’ were defined as low quality. The sixth category ‘I don't know’ was designed as a residual category. We calculated a quality gap measure per item as the proportion of patient responses indicating low quality.
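The item-level measures described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis script: the label ‘mostly correct’ for the second Likert category is hypothetical, the denominators for the residual and ceiling shares are assumptions, and the quality gap uses valid answers outside the residual category as the denominator. The example data are invented.

```python
from collections import Counter

# Coding of one five-point rating item. "mostly correct" is a hypothetical
# label for the second category, which the text does not name.
LOW_QUALITY = {"partly", "not correct", "not correct at all"}
RESIDUAL = "I don't know"
TOP_CATEGORY = "fully correct"


def item_metrics(responses):
    """Summarise one item; None marks a missing answer (item non-response).

    Assumes at least one valid (non-missing, non-residual) answer exists.
    """
    n_total = len(responses)
    answered = [r for r in responses if r is not None]
    # Valid answers exclude the residual category.
    valid = [r for r in answered if r != RESIDUAL]
    counts = Counter(valid)
    n_low = sum(counts[c] for c in LOW_QUALITY)
    return {
        # Share of all responders with no answer on this item.
        "non_response": (n_total - len(answered)) / n_total,
        # ASSUMPTION: residual share relative to all answered forms.
        "residual": (len(answered) - len(valid)) / len(answered),
        # ASSUMPTION: ceiling share relative to valid answers.
        "ceiling": counts[TOP_CATEGORY] / len(valid),
        # Quality gap: low-quality answers among valid answers.
        "quality_gap": n_low / len(valid),
    }


# Invented example data for a single item.
example = (
    ["fully correct"] * 6
    + ["mostly correct"]
    + ["partly"] * 2
    + ["not correct"]
    + [RESIDUAL]
    + [None]
)
metrics = item_metrics(example)
print(metrics)
print("ceiling effect flagged:", metrics["ceiling"] > 0.85)
```

Keeping the residual category out of the quality gap denominator means that an item with many ‘does not apply’ answers is judged only on the patients it concerns, at the cost of a smaller effective sample.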
To examine the underlying domains of the questionnaire, a factor analysis (principal component analysis, PCA) was conducted as a standard method. We used the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and the Bartlett test of sphericity to determine the appropriateness of PCA. Two PCAs were run, one for reporting items and one for rating items. The residual categories were recoded into missing values for PCA. Missing values were deleted pairwise. Oblique rotation (Promax) was used for reporting items and orthogonal rotation (Varimax) for rating items.
Sector transitions are the critical phases for quality problems in a typical care process across different sectors. Therefore, the questionnaire included items aiming to evaluate shared decision-making (Indication scale) and transition to follow-up care (Discharge and Transition scale). To assess criterion validity for these topics, the validated German version of the 9-item Shared Decision Making Questionnaire (SDM-Q-9) and a self-translated version of the 3-item short form of the Care-Transition-Measurement Questionnaire (CTM) were included. The SDM-Q-9 measures the degree of SDM, focusing on the decision between different treatment options. PEACS, on the other hand, assesses SDM as an information and involvement process, independent of other available options. The CTM measures the overall quality of care transition by assessing respect for the patient's preferences (Item 1), understanding of responsibility for self-managing health (Item 2) and understanding of medication purposes (Item 3). In contrast to the CTM-3, the items of PEACS place emphasis on communication of information in a more detailed way. Sum scores were calculated, and correlation tests comparing each external scale with our corresponding scale were run.
The test–retest method was applied to evaluate the reliability and reproducibility of the survey. Cohen's kappa coefficient quantifies the intra-rater agreement per item. In consideration of the paradoxical characteristics of the coefficient, we chose the weighted kappa (kw) with a weight of 0.75. We interpreted the coefficient in accordance with Altman and defined a limit of kw = 0.6. If a value was <0.6, we reviewed the proportion of overall agreement and the agreement matrix. In the case of inconsistency, the item was removed. To measure test–retest reliability at the level of the construct, we calculated a summary index for each factor and correlated the means. Values >0.7 are usually regarded as confirming reliability, and this was our cut-off point.
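For readers unfamiliar with the weighted kappa, the following minimal sketch computes it from a test–retest agreement matrix using agreement weights. The text does not specify how the weight of 0.75 enters the weighting scheme, so the helper `adjacent_weights` rests on an assumption: full credit on the diagonal, 0.75 for neighbouring categories and 0 otherwise. Both function names and the example table are illustrative only.

```python
def weighted_kappa(table, weights):
    """Weighted kappa from a k x k test-retest count table.

    table[i][j]   = number of patients choosing category i at test
                    and category j at retest.
    weights[i][j] = agreement weight in [0, 1], with 1 on the diagonal.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Weighted observed agreement.
    p_obs = sum(
        weights[i][j] * table[i][j] / n for i in range(k) for j in range(k)
    )
    # Weighted agreement expected by chance from the marginals.
    p_exp = sum(
        weights[i][j] * row_p[i] * col_p[j] for i in range(k) for j in range(k)
    )
    return (p_obs - p_exp) / (1 - p_exp)


def adjacent_weights(k, partial=0.75):
    """ASSUMPTION: 1 on the diagonal, `partial` for neighbours, 0 elsewhere."""
    return [
        [1.0 if i == j else partial if abs(i - j) == 1 else 0.0 for j in range(k)]
        for i in range(k)
    ]


# Invented 3 x 3 agreement matrix for one item.
example_table = [[20, 4, 1], [3, 15, 2], [0, 3, 12]]
kw = weighted_kappa(example_table, adjacent_weights(3))
print(f"kw = {kw:.3f}, meets 0.6 limit: {kw >= 0.6}")
```

With identity weights (1 on the diagonal, 0 elsewhere) the function reduces to the unweighted Cohen's kappa, which gives a convenient sanity check.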
Data analyses were made using IBM SPSS Statistics version 20 except for the weighted kappa.
Ethical approval for this study was received from the Ethics Committee of the Medical Faculty at the University of Heidelberg (Reference: S-586/2011).
We received 492 responses and excluded 16 forms because of missing or invalid data. Two patients refused participation (final response rate: 474/589, 80.5%). For the retest phase, 32 participants were lost to the study due to absent or false address data or a decision to withdraw. Four hundred and forty-two participants were included for the retest and from this remaining pool, 342 questionnaires were returned (77.3%). We reviewed the questionnaire for sufficient data and plausibility (concordance of treatment data between t0 and t1) and excluded 38 further questionnaires (final response rate retest: 304/442, 68.8%) (Fig. 2).
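The participant flow can be retraced with simple arithmetic. The short sketch below uses only the counts reported in this paragraph (variable names are illustrative) and recomputes the two final response rates.

```python
# Counts as reported in the text.
distributed = 589
received = 492
excluded_invalid = 16
refused = 2
included = received - excluded_invalid - refused  # first survey (t0)

lost_before_retest = 32
retest_eligible = included - lost_before_retest
retest_returned = 342
retest_excluded = 38
retest_included = retest_returned - retest_excluded  # retest (t1)


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(included, percent(included, distributed))  # 474 80.5
print(retest_eligible, retest_included)  # 442 304
print(percent(retest_included, retest_eligible))  # 68.8
```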
Table 1 provides socio-demographic information and other characteristics of the patient sample. Approximately half of participants were female (51.9%). The mean age was 63.2 years (SD 14.5). Health care for most participants was funded by the statutory German national health insurance scheme (93.2%).
The mean age of non-responders was 60.6 (SD 17.5) years and 49.1% (n = 53) were female. The distribution of treatment groups was also similar in both samples. Responders did not significantly differ from non-responding patients in these relevant characteristics: age (P = 0.11), sex (P = 0.59) and treatment groups (P = 0.42).
Table 2 provides descriptive information on the items of PEACS 1.0.
The proportion of missing values at item level ranged from 0.8 to 9.3%. Item 45 was developed as a filter question. Overall, 47 of 57 items had a non-response rate <5%. Ceiling effects ranged from 37 to 98%. Item 29 measured a rare event, which may have serious consequences for patients. Some items exhibited a high proportion of participants choosing the residual category, e.g. item 30 (80.8%, n = 366) or item 46 (93.8%, n = 406). The proportion of answers indicating quality gaps for reporting items ranged from 2.4 to 58.2%. Quality gaps were concentrated in the ‘Indication’ and ‘Discharge and Transition’ phases of the continuum of care. Quality gaps indicated by rating items were relatively moderate (8.3–23.5%) in comparison with reporting items, with an enhanced concentration on self-reported Outcome.
Some reporting items were excluded step by step from the PCA because the proportion of residual-category responses (treated as missing values in the factor analysis) was high or the Measure of Sampling Adequacy (MSA) was low (e.g. Item 27; MSA = 0.08). Items were also excluded if they loaded on more than one factor (e.g. Item 51). The final PCA included 28 reporting items with varying sample sizes (mean 370, SD 87, min 128, max 462 participants). The appropriateness of the items was very good (KMO: 0.891; Bartlett test: P < 0.001). Based on the Kaiser criterion (eigenvalue > 1), items were categorized into six factors explaining 64.0% of the total variance among the 28 variables. Additionally, we conducted a second PCA including all rating items, which also showed very good appropriateness (KMO: 0.83; Bartlett test: P < 0.001). These items were categorized into three factors explaining 69.8% of the total variance among the 12 variables. Factor loadings of the PCAs are shown in Supplementary data, Appendix S2.
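The Kaiser criterion and the share of explained variance can be illustrated with a short sketch. It assumes the eigenvalues of the item correlation matrix are already available (for example from statistical software) and uses invented eigenvalues for five variables, not the study results; the function name `kaiser_retain` is hypothetical.

```python
def kaiser_retain(eigenvalues):
    """Apply the Kaiser criterion to eigenvalues of a correlation matrix.

    Returns the retained eigenvalues (> 1) and the proportion of total
    variance they explain. For a correlation matrix the eigenvalues sum
    to the number of variables, so the sum is the total variance.
    """
    retained = [ev for ev in eigenvalues if ev > 1.0]
    explained = sum(retained) / sum(eigenvalues)
    return retained, explained


# Invented eigenvalues for 5 variables (they sum to 5).
example_eigenvalues = [2.4, 1.3, 0.6, 0.4, 0.3]
retained, explained = kaiser_retain(example_eigenvalues)
print(f"{len(retained)} factors retained, {explained:.1%} of variance explained")
```

The criterion only fixes the number of components; the rotation (Promax or Varimax) redistributes the explained variance among the retained factors without changing their combined share.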
We found a significant and high correlation of the factor scale ‘Shared decision-making at indication’ and the SDM-Q-9 scale (r = 0.814, P < 0.001). Between the factor scale ‘Information at discharge and follow-up’ and the CTM-3 we assessed a significant and moderate correlation (r = 0.511, P < 0.001).
All values for assessing reliability of PEACS 1.0 items are shown in Supplementary data, Appendix S3. Thirty-two items showed a good weighted kappa (kw > 0.6). Eight items had a lower but acceptable weighted kappa (0.52–0.59) with good proportions of overall agreement (po) and no abnormal agreement matrix. Three items (23, 27 and 30) were conspicuous. Because of the importance of the item content, we decided to keep them in the instrument for a broader field test.
Test–retest correlations based on construct level are shown in Table 3. The retest coefficients indicated good reliability (r > 0.7), except for factor IIX with a moderate value (r = 0.671).
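A construct-level retest coefficient of this kind can be sketched as a Pearson correlation between factor scores at the first survey (t0) and the retest (t1). The example below is a simplified illustration with invented data and complete responses; it uses plain sum scores per factor, and how the study handled missing values or built its summary index is not reproduced here. Function names are hypothetical.

```python
from math import sqrt


def sum_scores(item_rows):
    """Sum the item scores of one factor for each patient."""
    return [sum(row) for row in item_rows]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented item scores of a two-item factor for four patients.
t0_items = [[1, 2], [2, 3], [3, 3], [4, 4]]
t1_items = [[1, 2], [2, 2], [3, 4], [4, 4]]

r = pearson_r(sum_scores(t0_items), sum_scores(t1_items))
print(f"r = {r:.3f}, meets 0.7 cut-off: {r > 0.7}")
```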
Overall, 16 items of the pilot version of the questionnaire were excluded to develop PEACS 1.0 (Supplementary data, Appendix S4).
There is broad political and academic consensus supporting the measurement and improvement of quality across sectors, but the existing borders between the different sectors make it challenging. In this study, we developed and validated a generic German questionnaire (PEACS 1.0) to evaluate patients' experiences and detect potential quality gaps along the complete journey of care. The stepwise development process supported good content validity. The high response rate and the very low item-non-response indicate a very high acceptance by patients. Reliability was considered to be good using the test–retest procedure. The moderate criterion correlation with the CTM-3 could be caused by the differences in the item contents. CTM-3 focused on the clinical level, while the factor scale ‘information at discharge and follow-up’ addresses clinical as well as outpatient levels. This example indicates one of the challenges in developing an appropriate instrument to measure care across sectors: methodological standards and good practices are still in the early development stage.
Including patients' perspectives is fundamental to quality improvement in health care [2, 7]. Particularly in fragmented healthcare systems, patients are the only ones with first-hand experience of the different sectors from start to finish during their care journey. Therefore, an important part of our instrument development strategy was the use of focus groups as an involvement technique to identify patients' perspectives and preferences. From the patients' perspective, processes of communication, coordination and transition were defined as relevant quality dimensions, sometimes without reference to the outcome of these processes. These results are accompanied by further evidence that patients wish to be informed and involved in care processes, albeit to a greater or lesser extent on an individual basis. We evaluated patients' feedback as an important source of information, even in the context of complex issues like patient safety and infection control, and even though the patient role in quality assurance remains controversial. Several studies have shown, however, that patients are able to identify important care-related issues.
Patient-reported experiences measured by reporting items demonstrated a wide range of quality gaps. This comes with the limitation that the ratios of quality gaps hinged on the number of patients who chose a response category defined as a problem, in relation to all patients who chose a response category that was different from the residual category. The advantage of offering a residual category is to exclude patients who are not concerned with a particular item as well as to evaluate the importance of item content. The disadvantage is the reduction of valid responses for data analysis. Responses to item 30 (‘Culture of dealing with errors’), for example, indicated that a high proportion of participants chose the residual category ‘An error did not occur’. The information offered by the residual categories is worth analyzing on its own: in the case of item 30, the first result is whether an error occurred or not (based on the residual category), and the second result is how patients experienced the way the situation was dealt with if an error occurred. In the case of Item 46 (‘Written Information for home care providers’ with the residual category ‘Home care providers were not needed after transition’), for example, we measure a rare event that has a high potential for detecting quality gaps affecting a small target group. We presume that this will be important in the case of a broader sample. Due to these analytical options and because of their importance for cross-sectoral quality of care, we decided to keep items in the questionnaire despite high residual categories. The decision to keep or remove items cannot be determined solely in light of measurement properties. Overall, the ratios of quality gaps show that PEACS 1.0 is able to evaluate potential for quality improvement, even with the limitation of small samples for some items.
The developed PEACS 1.0 contains 59 items, based on the typical care process and focusing on transition processes. Domains in this version are ‘Preliminary care’, ‘Shared decision-making at indication’, ‘Patient education and information’, ‘Nursing staff’, ‘Accessible physicians’, ‘Pain therapy’, ‘Institutional treatment and transition’, ‘Information at discharge and follow-up’ and self-reported ‘Outcome’. Every item in these domains indicated a quality target with possible problems. Because the scales are not designed to be homogeneous, the decision to include or exclude an item had to be made on the basis of item content and its relevance to quality assurance needs. To communicate the results of the assessment, e.g. to providers or to the public, it seemed meaningful to aggregate the item results into problem scores per topic. The PCA and statistical analysis identified relevant items to be aggregated in respective dimensions. This approach supported the initial construct of the questionnaire and confirmed the general construct of the patient care journey. However, further research for construct validation is necessary before reporting aggregated measures in a national program.
Some items had to be excluded for PCA because of statistical requirements, although they are important for quality assurance due to content, based on the conceptual phase. The single items 29, 30, 31, 32 (‘Medication error’, ‘Culture of dealing with error’, ‘Facility cleanliness’, ‘Adherence to hand hygiene’) relate to patient safety and the single items 43, 44, 45, 46 (‘Support by institution on transition to home’, ‘Involvement of relatives’, ‘Support in organization of follow-up rehabilitation’, ‘Written information for home care providers’) relate to important transition elements. These items refer to rare or adverse events (reason for high rate of residual category = missing value). They are very important for patients as evaluated in the conceptual phase, which is why we decided to keep them in the questionnaire even if they are not suitable for PCA.
The result of the PCA, shown in Supplementary data, Appendix S2, includes 40 items. We recommended these 40 items as a minimal generic set of the bank of items of PEACS 1.0 to assess patients' experiences of care across sectors. This 40-item version of PEACS questionnaire provides a statistically well-grounded assessment of quality gaps with a focus on the transition between sectors.
In addition to the PCA-based 40-item version, PEACS 1.0 includes 19 further items, e.g. filter items and items that are not suitable for PCA because of the high rate of residual-category responses, but that have important content from the patients' perspective and a high potential for discriminative power. These items, which represent important patient experiences analyzed in focus groups, have a high potential to discriminate between healthcare providers based on quality measures, but with the limitation that PEACS 1.0 has to be tested on a bigger generic sample for final optimization before it is applied in the public SQG program. With this in mind, our study included patients with the index diseases defined by the legislated topics of the SQG program. However, as indicated by Table 1, the study included a high proportion (47%) of patients with diseases beyond those SQG topics. We therefore consider the questionnaire to be applicable for a broad cross-section of patients.
We constructed the PEACS 1.0 with an emphasis on quality of care processes. It must be stressed that the questionnaire was developed as a generic modular tool that is intended to be supplemented by specific topic-related items in patient surveys for the topics of the SQG program. The appropriateness of the instrument has to be evaluated in the light of this context.
Our study showed that it is possible to develop and to validate a survey instrument to detect quality gaps in fragmented health care by evaluating patients' experiences across different healthcare sectors. For benchmarking purposes and monitoring performance of healthcare providers over a period of time, it is necessary to validate newly developed questionnaires and test the discriminative power at the level of provider cluster. Indication-specific scales and disease-specific outcome measures will be added in a next step.
J.S. is a director and shareholder of the AQUA-Institute which is responsible for the development of the SQG program in contract with the Federal Joint Committee. A.K. is an employee of the AQUA-Institute. The department of General Practice and Health Services Research (S.N., S.L., D.O., K.G., K.B.), the Institute of Medical Biometry and Informatics (J.R.) and the Radboud University Nijmegen (M.W.) are scientific partners in the SQG program.
The SQG program and the work of the AQUA-Institute in this field are funded by the Federal Joint Committee (G-BA) under the German Social Code Book V, §137a. Funding to pay the Open Access publication charges for this article was provided by the AQUA Institute.
The authors thank all patients and general practices for participation in this study. They would also like to thank Sarah Berger for proofreading the manuscript and Dr. Katja Hermann for helpful methodological comments. Advice given by Prof. Eva-Maria Bitzer has been a great help in factor analysis. The authors would also like to thank all members of the various working groups for their fruitful comments during each development stage of the questionnaire.