This systematic review set out to provide insight into the state of quality assessment efforts for public mental health care (PMHC) services and systems around the world, the characteristics of performance indicators (PI) proposed by these projects, and the evidence on feasibility, data reliability and validity of PI for PMHC.
The systematic inventory of literature resulted in the inclusion of 106 publications that specified PI, sets of PI, or performance frameworks for the development of PI. In total, 1480 unique PI for PMHC were proposed, covering a wide variety of care domains and quality dimensions. Establishing aspects of feasibility and content validity of PI seems to be an integral part of indicator development processes. Through literature review, expert consultation, or stakeholder consensus, almost all publications show that the PI under development can be implemented and measure a meaningful aspect of health care quality. We found that for almost a quarter of the PI no data source was specified in the publication. Most of the remaining PI (53%) are based on administrative data. Eighteen publications, 17% of the total, reported on the assessment of criterion validity of PI for PMHC. In these publications, the criterion validity of 56 PI was assessed, less than 4% of the total. This percentage is even lower when we take into account that several studies assessed similar PI.
The majority of the publications focused on PMHC systems and services in the United States, and over 80% of the publications were concerned with PMHC systems in English-speaking nations. This could be explained by the organizational structure of the U.S. health care provision and payment system, which is primarily operated by private sector organizations and has traditionally put a relatively large emphasis on transparency and accountability of costs and performance of health care providers. The introduction of managed care techniques and organizations in U.S. mental health care in the late 1990s spurred the development of quality assurance instruments even further. This resulted in a plethora of PI to provide local, state, and federal administrators with information for PMHC policy and funding purposes as well as to guide quality improvement efforts. The skewness of the distribution of publications towards PMHC in English-speaking nations is possibly exaggerated by including only English and Dutch publications in the review. As performance measurement programs and efforts are predominantly focused on PMHC within a nation, they are likely to be published in the language of that nation. However, the structure of the health care system may have a profound effect on the efforts put into performance indicator research.
More than 40% of the PI aim to measure the effectiveness or clinical focus of PMHC. However, the remaining PI measure a wide variety of performance dimensions. This could indicate a lack of consensus on the definition of PMHC quality between nations and even within nations. The diversity of performance dimensions in PI is also indicative of (local) political interests in PMHC. When designing PI for PMHC systems or services, developers often consider the local political climate and interests, particularly as policymakers and politicians are the main stakeholders and primary users of the PI.
Only a relatively small number of PI combine data from multiple sources. Although the PI aim to measure performance on a system level of care, data systems of service providers are probably still 'stand-alone'. Issues such as privacy, absence of unique identifiers, data ownership, and lack of standard data formats could prevent data systems from integrating at the same rate as the service provision.
The risks that inadequate data reliability, in terms of completeness and accuracy, poses to the usability and feasibility of PI based on administrative data sources have been recognized by a number of authors and leading organizations in the field of performance measurement [e.g. 71]. It is therefore surprising that we found only two publications that explicitly assessed the reliability of administrative databases for PI in PMHC. It seems developers assume data reliability, or at least availability and completeness, based on expert opinion and stakeholder consultation. However, providers collecting the data often have an interest in the conclusions drawn from PI, and when they are asked by external organizations to extract data from their client-registration systems, data reliability cannot be assumed. Data reliability should be evaluated especially when services or systems benefit from better performance, or when the purpose of the PI is unclear to the unit (i.e. person or department) responsible for collecting the data.
The consultation of experts and stakeholders proves to be a widely accepted method to ensure face validity and contribute to the content validity of PI. It also seems to be an important tool for creating support in the field for the use of PI, whether for accountability and transparency purposes by (external) accrediting organizations and PMHC financing bodies, or for (internal) quality monitoring and improvement by PMHC care providers.
For only a fraction of the 1480 unique PI included in this inventory has the relationship with criteria of quality been assessed. An explanation for this finding is that criterion validity research is time-consuming and costly, and its added value is not always apparent to stakeholders. The performance on both the indicator and the criterion of a sufficiently large research group that is representative of the client population needs to be recorded in order to reliably assess the extent of the correspondence between indicator and criterion. Once stakeholders have reached consensus on the usefulness and feasibility of PI, indicator developing organizations often do not have the funds or the incentives to further study the validity of the PI, and they prioritize the utilization of the PI to increase transparency or accountability of the PMHC system. Understandably, these stakeholders have more interest in the information generated with PI than in 'fundamental' characteristics of the PI themselves, such as criterion validity.
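As an illustration of the correspondence between indicator and criterion described above, the following is a minimal sketch that assumes both are recorded as paired numeric scores for the same units (e.g. providers or client groups) and that a Pearson correlation is an acceptable summary of their association. The included publications used a variety of designs and statistics, so this is not the method of any particular study. The function name `pearson_r` and the example values are hypothetical and are not data from this review.

```python
from math import sqrt


def pearson_r(indicator, criterion):
    """Pearson correlation between paired indicator and criterion scores."""
    if len(indicator) != len(criterion) or len(indicator) < 3:
        raise ValueError("need at least three paired observations")
    n = len(indicator)
    mean_x = sum(indicator) / n
    mean_y = sum(criterion) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(indicator, criterion))
    var_x = sum((x - mean_x) ** 2 for x in indicator)
    var_y = sum((y - mean_y) ** 2 for y in criterion)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical example: per-provider indicator scores (e.g. a follow-up rate)
# paired with a criterion measure (e.g. mean symptom improvement).
indicator_scores = [0.42, 0.55, 0.61, 0.70, 0.78, 0.83]
criterion_scores = [2.1, 2.4, 3.0, 3.2, 3.9, 4.1]
r = pearson_r(indicator_scores, criterion_scores)
print(f"indicator-criterion correlation: {r:.2f}")
```

With only a handful of units, as in this toy example, the estimate is very unstable, which is one reason a sufficiently large and representative research group is needed.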
While the majority of the associations between the PI and the criteria studied in the included publications are statistically significant and in the expected direction, studies report mixed and in some cases even contradictory results for several PI. Measures of satisfaction, readmission, certification status, medication dosage adequacy, length of stay, and appropriateness of screening are reported to have no significant association with one or more criteria of PMHC quality. However, other studies do report significant associations of some of the same measures with other criteria, or even use these measures as criteria to validate others. The scientific and practical utility of criterion validation depends as much on the measurement of the criterion as it does on the validity of the indicator [15]. For many concepts related to PMHC quality, valid criteria are simply not available.