The Michigan Hand Questionnaire (MHQ) is one of the most widely utilized, hand-specific surveys that measures health status relevant to patients with acute and chronic hand disorders. However, item redundancy exists in the original version, and an abbreviated survey could minimize responder burden and offer broader applicability.
Patients (n=422) with 4 specific hand conditions completed the MHQ at two time periods: rheumatoid arthritis (RA) (n=162), thumb carpometacarpal osteoarthritis (CMC) (n=31), carpal tunnel syndrome (CTS) (n=97), and distal radius fracture (DRF) (n=132). Correlation analysis identified 2 items from each of 6 domains (function, activities of daily living, work, pain, aesthetics, and satisfaction). A brief-MHQ score was calculated as the sum of the responses to the 12 items. Psychometric analysis was performed to describe the reliability, validity, and responsiveness of the brief MHQ.
The brief-MHQ includes 12 items that were highly correlated with the summary MHQ score (r=0.99, p<0.001). The brief MHQ scores were highly correlated between the two time periods (r=0.78, p<0.001), and by disease type. Responsiveness of the brief MHQ was high for all diseases, and similar to that of the original MHQ.
The 12-item brief MHQ is an efficient and versatile outcomes instrument specific to hand disability that retains the psychometric properties of the original MHQ. The brief MHQ is an important tool to measure patient outcomes and the quality of care in hand surgery.
In the United States, health care quality is variable, difficult to measure, and not well correlated with health care expenditure. In response, health services research has focused on identifying ways to measure health care quality efficiently and accurately.(1) Efforts to measure health care quality range widely, from examining 30-day mortality or hospital readmission rates, to implementing re-certification standards for physicians in practice. However, hand disability presents a unique challenge for health services researchers in that traditional indicators, such as mortality or perioperative processes of care, are not sensitive enough to distinguish variations in patient outcomes. (2) Nonetheless, hand function has a profound effect on a patient’s quality of life. Therefore, measures of patient satisfaction and disability are essential quality indicators for clinicians, researchers, and health care policy-makers interested in the quality of care of upper extremity disability.
Over 25 years ago, the Socioeconomic Committee of the American Society for Surgery of the Hand advocated the development of a comprehensive instrument through rigorous methodology to measure patient outcomes following hand surgery.(3) In 1998, the Michigan Hand Outcomes Questionnaire (MHQ) was introduced as a hand-specific outcomes instrument that can effectively measure aspects of health status that are relevant to patients with acute and chronic hand disorders. (4) Today, the MHQ is one of the most widely utilized self-administered instruments to measure upper extremity disability and patient outcomes across a broad range of hand-related diseases. The MHQ has been translated into 8 different languages and used worldwide for a range of acute and chronic hand conditions, including trauma, nerve compression syndromes, arthritis, and inflammatory conditions. (5–19) The MHQ comprehensively gathers information on hand function, the ability to complete daily and occupational activities, patient satisfaction, pain, and aesthetic hand appearance. It is the only questionnaire currently in use that adjusts for hand dominance, and can distinguish the differences in disability between both hands. (20)
The MHQ was developed based on strict psychometric testing and has been extensively validated in field studies in the US and around the world. In its current form, the MHQ includes 37 items, and takes approximately 15 minutes to complete. Although it has been widely used for a variety of hand conditions, previous reliability studies have demonstrated redundancy in the items included in the original survey, which limits its application for larger studies.(4, 21) Studies of reliability measures, such as Cronbach’s alpha, have demonstrated that alpha values greater than 0.9 indicate item redundancy and multiple dimensions in a given scale rather than the true reliability of a scale, and should be avoided.(22, 23) A shortened version of the MHQ is a more attractive research instrument for population studies because it is more time-efficient, reduces responder burden, and can therefore minimize missing data. Previous research demonstrates that longer surveys yield lower response rates and poorer or incomplete data.(24, 25) While there is no defined optimal number of survey items, longer surveys clearly increase responder burden and the cost of survey production, distribution, and coding. (24, 26–28)
It is challenging to shorten a survey while retaining its reliability, validity and sensitivity. Shortened versions of the SF-36 and DASH have been developed and tested.(24, 29, 30) Using the experience from these shortened questionnaires, we created an abbreviated version of the MHQ based on rigorous methodology in order to preserve the essential properties of the original instrument. To develop a brief version of the MHQ, we used data gathered prospectively from patients with four distinct hand conditions: distal radius fractures, carpal tunnel syndrome, rheumatoid arthritis, and thumb carpometacarpal (CMC) arthritis. We hypothesize that the shortened version of the MHQ will be as valid and reliable as the original version, and yet more practical for studying health care quality in multicenter or nationwide studies.
We gathered data prospectively from 422 patients with four specific hand conditions: rheumatoid arthritis, thumb CMC osteoarthritis, carpal tunnel syndrome, and distal radius fracture. With the exception of patients with distal radius fractures, patients for the elective procedures were evaluated preoperatively and at a designated time period postoperatively.
In this dataset, 162 patients with rheumatoid arthritis (RA) were included in a larger, prospective study supported by the National Institutes of Health regarding the use of silicone metacarpophalangeal arthroplasty (SMPA) for joint deformities due to RA. The details of this study and the data collection have been described elsewhere.(31, 32) RA patients completed the MHQ at the time of enrollment and at 6 months follow-up. Additionally, 97 patients treated at the University of Michigan for carpal tunnel syndrome (CTS) completed the MHQ as part of a study designed to evaluate the complications associated with minimal-incision carpal tunnel release. Patients were included in the study based on a diagnosis of carpal tunnel syndrome from clinical presentation (history of hand dysesthesia along the distribution of the median nerve and a positive Phalen’s flexion test finding and/or a positive Tinel’s sign) and electrodiagnostic study confirmation of CTS. Patients completed the MHQ prior to carpal tunnel release, and at 6 months postoperatively.(18, 33) Data were also collected from 132 distal radius fracture (DRF) patients who underwent operative fixation using the volar locking plating system (VLPS). As it was not possible to survey patients prior to their injury, responses taken at 3 months following surgery were used as baseline, and responses taken at 6 months following surgery were used as follow-up. (34) Finally, 31 patients with thumb carpometacarpal (CMC) arthritis completed the MHQ as part of a larger study to determine patient outcomes following trapeziectomy and modified abductor pollicis longus suspension arthroplasty. (12) Patients completed the MHQ at their preoperative clinic visit and again at 3 months postoperatively. All study protocols were approved by the institutional review boards at the University of Michigan.
Additionally, the DRF, RA, and CMC arthritis patients were assessed at baseline and at each clinic visit with objective measures of functioning, specifically grip strength in kilograms, key or lateral pinch strength in kilograms, and Jebsen-Taylor Test score.(35) The Jebsen-Taylor test is a battery of tasks for which the subject’s speed of completion is calculated, and consists of the following components: (1) writing a short sentence; (2) turning over 3- by 5-in. cards; (3) picking up small objects and placing them in a container; (4) stacking checkers; (5) simulated eating; (6) moving large, empty cans; and (7) moving large, weighted cans. The time required in seconds to complete each task was recorded for the subjects’ dominant and nondominant hands. A subset of patients completed the Arthritis Impact Measurement Scales 2 (AIMS2) questionnaire, a 45-item, self-administered outcomes tool designed to assess health status in patients with inflammatory arthritis and osteoarthritis.(36) The AIMS2 is designed to provide a global, self-reported assessment of patient health status, and yields information in 4 domains: physical functioning, affect, symptom, and social interaction. Scores range from 1–10, with lower scores reflecting better health.
All subjects completed the original, 37-item version of the MHQ (see Appendix 1). Each item on the MHQ allows the subject to respond to the question along a Likert-like scale ranging from 1 to 5. Responses are then summed to yield a domain score for each of the 6 domains, and respondents must respond to 50% or more of the items within a scale for their responses to be considered sufficient to generate a domain score. Each domain score is transformed to range from 0 to 100, and higher scores indicate better performance for all scales, with the exception of the pain scale. The MHQ also yields a summary score, which is calculated by averaging the scores for each domain, after reversing the pain score.
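The domain scoring rules described above can be illustrated with a minimal sketch. It is not the scoring algorithm from Appendix 1: the linear rescaling (mean answered response of 1 maps to 0, 5 maps to 100), the use of the mean of answered items to handle skipped items, and the function names `domain_score` and `summary_score` are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def domain_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Rescale one domain's 1-5 responses to 0-100; None marks a skipped item."""
    answered = [r for r in responses if r is not None]
    # Rule from the text: at least 50% of the domain's items must be answered.
    if not responses or len(answered) * 2 < len(responses):
        return None
    # Assumption: linear rescale of the mean answered response (1 -> 0, 5 -> 100).
    return (mean(answered) - 1) / 4 * 100


def summary_score(domain_scores: Dict[str, float]) -> float:
    """Average the 0-100 domain scores after reversing the pain score."""
    oriented = [
        100 - score if name == "pain" else score
        for name, score in domain_scores.items()
    ]
    return mean(oriented)
```

For instance, a pain score of 20 contributes 80 to the summary average, so that higher summary scores consistently indicate better hand health.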
Item reduction was performed using a concept-retention technique in order to identify those items of the original MHQ that would be retained in the brief version. (29) Briefly, this technique allows the selection of items based on their clinical relevance rather than on statistical testing alone. This technique is ideal in that it allows investigators to retain those items that are clinically relevant, regardless of their statistical value. For example, within the pain scale, “Describe the pain in your hand(s)/wrist(s)” and “How often did the pain in your hand(s)/wrists(s) make you unhappy” had similar correlation coefficients of 0.18. However, all authors reviewed these items to choose the most clinically relevant item, which was included in the final instrument. Twelve items were selected, with two items chosen from each scale. This allows each scale from the original survey to be represented in the brief version. The two items from each scale that were most correlated with the original MHQ score were retained. Items were selected from each domain by the authors with the goal of retaining all concepts of the original framework of the MHQ (Figure 1). We then confirmed those items that were retained by determining the correlation of each item of the MHQ with the summary MHQ score using correlation analysis. Because the original MHQ requires subjects to respond to items separately for the right and left hand, except for items in the work domain, responses of the two hands were averaged for each item prior to item reduction for all patients except those with distal radius fractures. The brief-MHQ does not distinguish between laterality of hand symptoms, in order to make it as economical as possible in content and response time.
We generated a summary score from the items in the brief-MHQ by averaging the responses of the final 12 items. The minimum possible raw summary score for the brief-MHQ is 1 and the maximum is 5. The brief-MHQ score was normalized to a scale of 0 to 100, similar to that of the original MHQ. Because of its brevity, the brief-MHQ requires a complete response to all 12 items; summary scores were not calculated if any of these items were missing. For the full MHQ, scores were only calculated if at least 3 out of the 6 scales were complete. For the function, ADL for individual hands, pain, and aesthetic scales, respondents must have no more than 2 missing values to calculate scores for these scales. The satisfaction and ADL using both hands scales required 3 or fewer items missing. The final survey and scoring algorithm are included in Appendix 2.
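As a rough illustration of the brief-MHQ summary score described above, the sketch below averages 12 responses on the 1 to 5 scale and rescales the result to 0 to 100. The linear normalization formula and the function name `brief_mhq_score` are assumptions; any item-level reorientation (for example, of the pain items) specified in Appendix 2 is not reproduced here.

```python
from typing import Optional, Sequence


def brief_mhq_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return a 0-100 brief-MHQ summary score, or None if any item is missing."""
    # Rule from the text: all 12 items must be answered.
    if len(responses) != 12 or any(r is None for r in responses):
        return None
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    raw = sum(responses) / 12      # raw summary score, between 1 and 5
    # Assumption: linear normalization (1 -> 0, 5 -> 100).
    return (raw - 1) / 4 * 100
```

Returning `None` rather than imputing a value mirrors the complete-response requirement: with only 12 items, a single missing answer removes half of the information for its domain.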
Reliability is the ability of a survey instrument to precisely measure a concept. (37) We measured the reliability of the brief-MHQ among RA patients who did not undergo SMPA because their responses would be expected to remain similar over a relatively short period. The brief-MHQ summary score and the original MHQ summary score were compared between baseline and 6-month follow-up. The degree of correlation between the measurements of the two time periods was obtained using Spearman’s correlation coefficients. Paired t-tests were used to determine the average difference in the summary score between these two time periods. A mean difference of 0 indicates perfect test-retest reliability.
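The two test-retest statistics named above can be sketched with the Python standard library alone. This is an illustrative reimplementation, not the Stata procedure used in the study; the helper names are hypothetical, and the paired t function returns only the mean difference and the t statistic (a p-value would additionally require the t distribution with n − 1 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def paired_t(baseline, follow_up):
    """Return (mean difference, t statistic) for paired measurements."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    d_bar = mean(diffs)
    return d_bar, d_bar / (stdev(diffs) / sqrt(len(diffs)))
```

A high rho together with a mean difference near 0 is the pattern expected of a stable instrument in patients whose condition has not changed.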
As an additional measure of reliability, we calculated the intraclass correlation coefficient (ICC), which describes the agreement of estimates between paired data. The values of the ICC range between 1 (perfect correlation) and −1/(k−1) (low correlation), where k is the number of measurements per subject. (38)
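The text does not state which form of the ICC was computed. As one possible illustration, the sketch below implements the one-way random-effects ICC from the between-subject and within-subject mean squares; the choice of this form and the function name `icc_oneway` are assumptions. With two measurements per subject, as in a baseline and follow-up design, the smallest value this form can take is −1.

```python
from statistics import mean


def icc_oneway(data):
    """One-way random-effects ICC for rows of repeated measurements.

    `data` holds one row per subject; every row has the same number of
    measurements (e.g. baseline and follow-up scores).
    """
    n_subjects = len(data)
    n_times = len(data[0])
    grand_mean = mean(v for row in data for v in row)
    row_means = [mean(row) for row in data]
    # Between-subject mean square: spread of subject means around the grand mean.
    ms_between = n_times * sum((m - grand_mean) ** 2 for m in row_means) / (n_subjects - 1)
    # Within-subject mean square: spread of measurements around each subject's mean.
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(data, row_means) for v in row
    ) / (n_subjects * (n_times - 1))
    return (ms_between - ms_within) / (ms_between + (n_times - 1) * ms_within)
```

Unlike a correlation coefficient, this statistic is lowered by a systematic shift between the two time points, which is why it complements the Spearman coefficient as a measure of agreement.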
Validity is defined as the ability of a survey instrument to accurately measure a concept of interest. Important types of validity to consider in survey development include content validity, criterion validity, and construct validity. Content validity refers to the extent to which the instrument appears capable of measuring the desired outcome. The original MHQ was developed with strict attention to psychometric principles, and has been validated in a variety of acute and chronic hand conditions, and translated into several languages. Therefore, we would expect that the brief-MHQ would also have similar content validity.(12, 14, 15, 39, 40) Criterion validity describes the extent to which a survey instrument compares with the accepted reference standard. Currently, there is no established standard by which to measure health outcomes related to hand dysfunction across a wide range of acute and chronic hand conditions. Therefore, we rely on construct validity to ensure that the brief-MHQ accurately measures hand disability and function. Construct validity describes the extent to which survey responses or scores correspond to expected values. If the brief MHQ is valid, we hypothesize that the brief MHQ scores and the original MHQ scores should be similar within disease groups. To analyze this, we used multiple linear regression to compare the mean brief-MHQ scores, adjusted for clinical and demographic characteristics, by disease type. Furthermore, we compared the brief and full MHQ scores with objective measures of hand function (grip and pinch strength, and Jebsen-Taylor test score) and subjective measures of hand function (AIMS2 scores) as an additional measure of validity.
Responsiveness is the ability of a survey instrument to detect changes in an outcome of interest over time.(41) To determine the responsiveness of the brief-MHQ, we compared the summary brief-MHQ score at baseline and at follow-up after a surgical intervention for each of the disease groups using paired t-tests. In order to compare the change in brief-MHQ scores over time in a standardized fashion, we calculated the standardized response mean (SRM), defined as the mean change from baseline to follow-up divided by the standard deviation of the change scores, for each disease type. We expect that a sensitive survey instrument should have a correspondingly high SRM.(42) Using Cohen’s effect size definition, we can interpret an SRM of 0.2 as a small effect size, 0.5 as a medium effect size, and 0.8 as a large effect size. (43) As an additional point of comparison, the responsiveness of the brief MHQ was compared with that of the original MHQ, grip strength, key pinch strength, and Jebsen-Taylor test for those patients who completed these measures.
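The SRM definition above translates directly into a few lines of code. The sketch below is unadjusted (it does not control for disease type), and treating Cohen's benchmarks as lower cut points, along with the "negligible" label for values under 0.2, is an assumption made for illustration.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of the change scores divided by their sample standard deviation."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def cohen_label(srm):
    """Label an SRM using Cohen's 0.2 / 0.5 / 0.8 benchmarks as cut points."""
    size = abs(srm)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: label for values below Cohen's "small"
```

Because the denominator is the variability of the change itself, an instrument on which nearly all patients improve by a similar amount yields a high SRM even when the absolute change is modest.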
Descriptive statistics were used to describe the characteristics of the study sample. Linear regression was used to identify the survey items to be retained in the final version of the brief-MHQ, and to describe the correlation of these items to the summary score of the original MHQ. Reliability testing was performed using Spearman’s correlation coefficients and paired t-tests as described above. Validity testing was performed by using linear regression to determine the brief-MHQ summary score for each disease type, adjusted for age, gender, and education level. Responsiveness of the brief-MHQ was determined using paired t-tests and calculating the SRM as described above. Multivariate linear regression was used to generate SRM measures controlling for disease type. Statistical significance was set at an alpha level of 0.05. All analyses were conducted using Stata 10.1 (StataCorp, College Station, Texas).
The demographic characteristics of the study sample are detailed in Table 1. This study sample consisted of 422 patients, of whom 132 had suffered distal radius fractures, 97 had CTS, 162 had RA, and 31 had CMC arthritis. The average age of the patient sample was approximately 55 years, and 69.7% were women. Approximately 33% had a college education or higher.
The results of the item reduction of the MHQ are described in Table 2. In all of the domains, except satisfaction, the two survey items for each domain that were most strongly associated with the summary MHQ score were retained in the final version of the brief-MHQ for a total of 12 items. Within the function domain, the following two items “Overall, how well did your hand work?” (R²=0.41) and “How was the sensation in your hand?” (R²=0.21) were most correlated with the summary MHQ, and were retained in the brief-MHQ survey. Within the ADL domain, two items “How difficult was it for you to hold a frying pan?” (R²=0.16) and “How difficult was it for you to button a shirt/blouse?” (R²=0.17) were most strongly correlated with the summary MHQ score, and were retained in the brief-MHQ. The items within the work domain that were most strongly correlated with the summary MHQ score included “How often were you unable to do your work in the last week because of your hands/wrists?” (R²=0.25) and “How often did you take longer to do tasks in your work because of problems with your hands/wrists?” (R²=0.34). The items within the pain domain that were selected for the brief-MHQ included “Describe the pain in your hands/wrists” (R²=0.18) and “How often did the pain in your hands/wrists interfere with your daily activities?” (R²=0.39). Within the aesthetic domain, two items “I am satisfied with the look of my hands” (R²=0.33) and “The appearance of my hand interferes with my normal daily activities” (R²=0.32) were most strongly correlated with the summary MHQ score, and were retained in the brief-MHQ. Finally, the two items related to satisfaction that were retained included “Satisfaction with the motion of your fingers” (R²=0.14) and “Satisfaction with the motion of your wrist” (R²=0.19).
These items were selected for inclusion in the brief-MHQ over others with higher correlation coefficients because these items contained concepts that were not represented in other portions of the survey. After these 12 items were selected for inclusion in the final brief-MHQ survey, regression analysis revealed that these survey items explained 97.8% of the variance of the original summary MHQ scores.
Table 3 details the reliability of the brief-MHQ. The reliability of the brief MHQ was examined by measuring the test-retest correlation over a 6-month period among a subset of 68 rheumatoid arthritis patients who did not undergo surgical intervention and had both baseline and 6-month measurements available. The correlation and ICC values for the brief-MHQ scores between the two time periods were high (r=0.78, rI=0.91), and the mean difference between the two scores was not statistically significant (0.22, p=0.87). Similar findings were noted for the original MHQ scores. This indicates excellent test-retest reliability of the brief MHQ in this subset of patients.
Figure 2 shows the adjusted means of the brief-MHQ summary score and the original MHQ summary score, stratified by disease type and adjusted for age, gender, and education. We hypothesized that the relative hand health status by the different disease types shown using the original MHQ would also be shown using the brief-MHQ scores. We observed that patients with DRF had the highest summary MHQ score (77.8 ± 1.60), and patients with RA had the lowest summary MHQ scores (51.7 ± 1.38). Similarly, patients with DRF had the highest brief MHQ score (77.8 ± 1.42) and patients with RA had the lowest brief MHQ scores (47.6 ± 1.34). Of note, DRF patients were surveyed postoperatively, and had significantly better hand outcomes than patients with each of the other three conditions.
Table 4 details the responsiveness of the brief-MHQ to clinical change among patients who underwent surgical intervention by disease type. Overall, the responsiveness of the brief MHQ was high for all disease types, even in the DRF patients whose measurements reflect the improvements between 3 and 6 months post VLPS, and similar to that of the original MHQ. Responsiveness was highest among RA patients who have undergone SMPA, and was similar for the brief-MHQ (SRM=1.28) and the original MHQ (SRM=1.36).
Table 5 compares the changes in scores between the two time periods and responsiveness to clinical change for the brief MHQ, the original MHQ, grip and pinch strength, and Jebsen-Taylor test score. All analyses were adjusted for disease type, and CTS patients were omitted from this portion of the analysis as they did not complete these parameters. Responsiveness to clinical change was highest for the brief MHQ (SRM=4.15), followed by the original MHQ (SRM=3.30), pinch strength (SRM=1.60), and Jebsen-Taylor test score (SRM=1.26). Responsiveness was lowest for grip strength (SRM=0.36).
Figure 3 describes the correlation between the brief MHQ and objective measures of hand function, specifically grip and pinch strength, and Jebsen-Taylor test score. The brief MHQ was moderately correlated with each of these measures, after adjusting for disease type, age, gender, and education level. The correlation values were highest for grip strength (r=0.38), followed by pinch strength (r=0.35), and Jebsen-Taylor test (r=0.35). Similar trends were identified in the correlation of the full MHQ with these parameters.
Figure 4 describes the correlation between the brief MHQ and patient-reported hand function among a subset of the patient sample. Rheumatoid arthritis patients completed the AIMS2 survey, and responses to each domain were compared with brief MHQ score, after controlling for age, gender, and education level. The brief MHQ score was most highly correlated with function (r=0.68) and least correlated with social interaction (r=0.15). Brief MHQ score was moderately correlated with affect (r=0.36) and disease symptoms (r=0.53). Similar trends were identified in the correlation of the full MHQ with each of these domains.
Health services research relies on the ability to accurately and reliably measure patient outcomes to describe variations in health care quality. To describe patient outcomes related to upper extremity disability and surgery, surgeons rely on patient-reported data regarding hand function, pain, and satisfaction that cannot be captured by objective testing. (44) The MHQ is popular because it comprehensively measures multiple aspects of hand disability and adjusts for hand dominance and injury. It includes measures of satisfaction and aesthetic appearance, which are important for patients with chronic and deforming hand conditions such as osteoarthritis and rheumatoid arthritis. (29, 45–48) In this study, we have created a shortened, 12-item version of the original MHQ by eliminating certain items and selecting those items with the greatest correlation with the original MHQ. The brief MHQ was reliable on repeated testing, well correlated with the original version, and can detect clinical change among patients who undergo surgery. It was correlated with objective measures of hand function, including Jebsen-Taylor test score and pinch strength, and subjective measures of hand function, including domains of the AIMS2 instrument.
Previous research has demonstrated that longer surveys are associated with lower response rates.(49) For example, Jepson et al. conducted a study of physicians to determine the response rate to surveys of varying lengths. In this study, response rates declined sharply with increasing length. (50) However, reducing the number of survey items can also reduce the comprehensiveness and precision of the data that can be collected.(51) These tradeoffs must be weighed against the benefit of a shorter survey and the ultimate purpose of the instrument. Often, scales that were present in the original instrument cannot be maintained in the shorter instrument. Furthermore, homogeneity of items in measuring a distinct concept is traded for heterogeneity of items, in order to capture data. For example, when the SF-36 was shortened to abbreviated versions such as the SF-12 and SF-8, representative items from scales were able to be maintained.(24, 25) However, the selected items were heterogeneous and unique, although they had reliable variance and were able to predict the outcome of interest. Therefore, the concepts measured by shorter surveys are assessed with less precision and greater variance. Regardless, these differences in precision are not likely to be detrimental when used in large studies with greater than 500 subjects, as confidence intervals of estimates are primarily related to sample size. Therefore, it is important to take into account the purpose of the instrument and the scale of the study when developing and implementing shorter survey instruments in order to account for these tradeoffs.(25, 52)
Scientific investigation relies on the ability to accurately and efficiently measure phenomena in a given population. An ideal measure should measure concepts consistently with as few items as possible. Achieving this ideal is challenging in areas where concepts such as pain and satisfaction are ambiguous and dynamic. To create a new version of the MHQ, we chose to use the concept-retention technique to identify the two items in each domain most highly correlated with the original MHQ score to include in the final version. This approach has been used successfully for item reduction of other instruments, such as the Disabilities of the Arm, Shoulder, and Hand (DASH) instrument. (29) This methodology has been criticized previously in that it is more subjective compared with other psychometric approaches, such as the Rasch or equidiscriminative techniques. Input from clinicians and patients in survey development can potentially lead to more heterogeneous scales. However, clinician input is often correlated with greater validity, and is not limited by specific mathematical or statistical parameters, such as item difficulty, which may exclude important items that are clinically relevant. (53, 54)
While the original MHQ remains a robust tool for comprehensive analysis, the brief MHQ will be an important adjunct to the original instrument for collecting data on a large scale. The responsiveness of the brief MHQ in this analysis indicates that it will detect clinical change with sensitivity similar to that of the full MHQ. For example, for studies with smaller sample sizes, the comprehensive questions in the 6 scales of the original MHQ can minimize “noise” inherent in blunter survey instruments. Conversely, in larger studies, the brief-MHQ is ideal to reduce responder burden and maximize response rates. This shortened instrument will allow the domains of the MHQ to be applied more efficiently and with greater versatility across a broad range of hand diseases. The brief-MHQ will likely excel as a screening tool, or to capture a cross-sectional snapshot of patient outcomes in larger-scale audits of practice outcomes. Surgeons could use the brief-MHQ to document their practice outcomes and follow these longitudinally as a way to compare care against normative values and identify areas for attention or improvement. The brief-MHQ will offer greater efficiency than the original MHQ for this purpose by reducing responder burden and eliminating the need for complicated scoring algorithms.
This study has several notable limitations. First, the brief-MHQ was developed in a group of patients with four specific hand diseases, and its applicability for other conditions requires testing in large field studies. However, the diseases included in this prospective dataset represent both acute and chronic disorders, and include patients with degenerative, inflammatory, and nerve compression syndromes. Additionally, we performed validity testing by comparing our abbreviated instrument against the original instrument. However, there is no existing “gold standard” survey instrument for hand disability for comparison. Objective measures, such as grip and pinch strength, may not correlate with other important aspects of hand dysfunction, such as aesthetic deformity.(20, 55–57) Furthermore, the shortened version of this instrument does not account for laterality, and does not retain the detail of the 6 domains of the original version. Finally, our analysis was conducted retrospectively on a subset of patients in our dataset. This work captures our experience in the initial development of the brief MHQ, and details our efforts to perform item reduction of the MHQ and define psychometric properties of the resulting instrument. Rigorous survey development and item reduction demand a sequential process including systematic review of existing knowledge and qualitative and quantitative preliminary testing, with eventual expansion to larger multicenter studies for survey refinement. The brief MHQ will be further refined through a larger, prospective study to describe the logistics of survey administration and applicability to other conditions.
The brief, 12-item version of the MHQ has the potential for many applications for surgeons in practice, researchers, and policy-makers interested in improving the delivery of care to patients with hand disability. This shortened version has psychometric properties similar to those of the original MHQ, and demonstrates excellent reliability and validity across a variety of acute and chronic hand conditions. The brief-MHQ will be an essential tool for measuring the quality of care and influencing practice patterns to optimize the practice of hand surgery.
Supported in part by a grant from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (R01 AR047328) and a Midcareer Investigator Award in Patient-Oriented Research (K24 AR053120) (to Dr. Kevin C. Chung).
Nothing to Declare.