The objectives of this study are the following: 1) report passive hip ROM in asymptomatic young adults, 2) report the intra-tester and inter-tester reliability of hip ROM measurements among testers of multiple disciplines, 3) report the results of provocative hip tests and tester agreement.
descriptive epidemiology study
Twenty-eight young adult volunteers without musculoskeletal symptoms, history of disorder or surgery involving the lumbar spine or lower extremities were enrolled and completed the study.
Asymptomatic young adult volunteers completed questionnaires and were examined by two blinded examiners during a single session. The testers were physical therapists and physicians. Hip range of motion and provocative tests were completed by both examiners on each hip.
Inter-rater and intra-rater reliability for ROM and agreement for provocative tests were determined.
Twenty-eight asymptomatic adults with a mean age of 31 years (range 18–51 years), mean modified Harris Hip Score of 99.5 ± 1.5, and UCLA Activity score of 8.8 ± 1.2 completed the study. Intra-rater agreement was excellent for all hip range of motion measurements, with intraclass correlation coefficients (ICCs) ranging from 0.76 to 0.97 and similar agreement whether the examiner was a physical therapist or a physician. Excellent inter-rater reliability was found for hip flexion ICC 0.87 (95% CI 0.78 to 0.92), supine internal rotation ICC 0.75 (95% CI 0.60 to 0.84) and prone internal rotation ICC 0.79 (95% CI 0.66 to 0.87). The least reliable measurements were supine hip abduction (ICC 0.34) and prone external rotation (ICC 0.18). Agreement between examiners ranged from 96–100% for provocative hip tests which included the hip impingement, resisted straight leg raise, FABER/Patrick’s and log roll tests.
Specific hip ROM measures show excellent inter-rater reliability and provocative hip tests show good agreement among multiple examiners and medical disciplines. Further studies are needed to assess the utilization of these measurements and tests as a part of a hip screening examination to identify young adults at risk for intra-articular hip disorders prior to the onset of degenerative changes.
A greater understanding of early intra-articular hip disorders prior to the onset of degenerative changes has developed as a result of improved understanding of pathoanatomy, biomechanics, imaging, and hip arthroscopy. Bony abnormalities of the hip such as developmental dysplasia of the hip (DDH) and femoral acetabular impingement (FAI) are thought to contribute to early intra-articular hip disorders and eventually, osteoarthritis.[1–10] To some degree, these bony abnormalities may be detected on physical examination. As a result, there is a growing need to establish a range of values for physical examination measurement for subgroups of symptomatic and asymptomatic adults in order to screen and diagnose individuals at risk for symptomatic early intra-articular hip disorders. If hip range of motion (ROM) and provocative tests can be used as screening tests to identify hips at risk, then quick screenings can be performed and preventative strategies may be implemented.
There are several factors to be considered in utilizing physical examination parameters to detect a hip disorder. The influence of age, gender, positioning during measurement, and active versus passive range of motion (ROM) of the hip have not been adequately documented.[1, 11–17] For example, Simoneau et al found that prone vs. seated position had little effect on measurements of active hip internal rotation, but did have a significant effect on external rotation. Provocative physical examination techniques are commonly utilized to detect hip pain. The number of asymptomatic adults that have a positive provocative hip test(s) is unknown. Further data is needed to determine the presence of positive provocative hip tests in asymptomatic young adults. The establishment of expected examination parameters in asymptomatic adults will lead to the development of hip screening diagnostic tools that can be utilized to determine potential patients at risk for early intra-articular hip disorders.
Although passive hip ROM is often estimated, the standard goniometer has been widely used for both research and clinical purposes to document ROM of the hip. The reliability of standard goniometer measurements has been well established.[18–21]
In a review of goniometric measures of the extremities, Gajdosik and colleagues found good reliability in all measures with intra-tester greater than inter-tester reliability. In addition to ROM, provocative special tests are commonly used by clinicians to assess symptoms and relate them to a hip disorder. Some of the provocative physical examination special tests commonly utilized for assessing hip pain are FABER, Patrick’s, anterior hip impingement tests, logroll, and resisted straight leg raise (SLR) tests.[13–14, 16–17, 22–27] The reliability of these tests has also not been established. Cibulka and Delitto found a significant difference in perceived pain response and reproduction of pain with the FABER test. They concluded that physical therapists should evaluate the sacroiliac joint in patients with hip pain. Ross and colleagues tested the test-retest reliability of Patrick’s test as a hip range of motion assessment method. The results of this study support the use of Patrick’s test as being a reliable measure of general hip motion when used by an inexperienced tester. No reliability regarding the accuracy of pain provocation in detecting a hip disorder has been documented for this test.
Studies of patients with hip disorders prior to the onset of degenerative changes (FAI, DDH, acetabular labral tears, chondrosis) describe these patients as young adults between 18 and 60 years of age.[4, 23–24, 26–27, 30–36]
MacDonald et al initially described the clinical evaluation of the symptomatic young adult hip with the impingement test and abductor fatigue for assessment of FAI. Crawford and Villar discussed the current concepts in the management of FAI, establishing signs of FAI by restriction of hip flexion with adduction and internal rotation and positive impingement test. Combining the results of passive hip ROM examination and provocative hip test is important in assessing symptomatic young adults. The presence of ROM limitations or excess beyond common limits and/or positive provocative hip tests in the asymptomatic active young adult population is unknown.
Establishing a range of values for these tests in the asymptomatic young adult population may assist in the diagnosis of early intra-articular hip disorders. The authors of this study participate in a young hip disorder multidisciplinary research and clinical care group at a tertiary university. Members include orthopaedic surgeons with expertise in hip surgery, physiatrists, and physical therapists. Establishing consistency among the examiners in this group is important. The group plans to further develop a series of physical examination tests to be used as a hip screen with the long-term goal of providing a means to assess which individuals are at risk for intra-articular hip disorders prior to the onset of degenerative changes. Because the group will attempt to create a useful hip screening examination, a large number of examinations are anticipated. Not all examiners will be available for every hip screening examination date. As a result, establishment of the most reliable tests among a group of examiners is essential. The purposes of this study are the following: 1) report the passive ROM of the hip using standard goniometer measurements in asymptomatic young adults between the ages of 18 and 51, 2) report the intra-tester and inter-tester reliability of goniometer ROM measurements with multiple testers of various disciplines, 3) report the results of provocative hip tests performed by multiple testers in this asymptomatic young adult population.
Approval was given by the Human Studies Committee at Washington University School of Medicine prior to recruitment. Volunteers were recruited via fliers and emails at a tertiary university hospital setting. Volunteers between the ages of 18 and 51 years of age without a history of low back, pelvis or lower extremity symptom, disorder, or surgery, were recruited to participate in this study. Other exclusions included previous history of tumor in the lumbar spine, pelvis or lower extremity or a medical condition that would preclude participation. This age range for volunteer recruitment was chosen in order to include subjects that had reached skeletal maturity and were at low risk for degenerative intra-articular hip changes. [4, 23–24, 26–27, 30–32, 34–35]
The examiners included 9 physical therapists, 5 physiatrists, and 2 orthopedic surgeons with expertise in the musculoskeletal physical examination. To promote consistency in goniometer measurements and provocative physical examination tests, all examiners participated in two training sessions conducted by a senior physical therapist (MHH) that included instruction and practice of the testing procedures. The first training session lasted one hour and included all examiners to allow for discussion among examiners. Three weeks later a second training session took place that was conducted by the same physical therapist (MHH). The entire exam was reviewed with each examiner. Performing passive ROM measurements with a goniometer has been found to be a reliable method of assessing the motion.[18–21]
Recorders (medical assistants, residents, and physical therapy students) assisted by recording the angle displayed on the goniometer that the examiner had measured. All of the recorders underwent two training sessions by the first author and a senior physical therapist to help ensure that each recorder read the measurement accurately.
Each subject was examined by two examiners during a single session. Prior to the testing session, the subjects completed self-report questionnaires. With the assistance of a recorder, the first examiner completed the examination including ROM and provocative tests of bilateral hips. The side evaluated first was determined by examiner preference. After a short break of approximately five minutes, the second examiner completed the examination. The order of the testers was randomized prior to testing each subject.
Questionnaires included information regarding demographics such as self-reported age, height, and weight, in addition to the UCLA activity score and the Modified Harris Hip Score (MHHS). The latter two are validated outcome tools utilized for assessing hip disorders and the associated activity levels, pain, and dysfunction.[38–39] The UCLA activity questionnaire contains descriptive activity levels ranging from 1–10 where a higher score indicates a higher activity level. In the MHHS, pain represents approximately 48% of the total score (44 points) and function represents about 52% of the total score (47 points). A multiplier of 1.1 provides a total possible score of 100. We incorporated Harris’s score interpretation scheme which includes: 90–100 excellent, 80–89 good, 70–79 fair, below 70 poor. The authors chose to include these scores in this study to demonstrate the activity level and confirm that the volunteers were free of symptomatic hip or lower extremity disorders causing functional limitations.
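The MHHS arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not the scoring procedure used in the study: the function names `mhhs_total` and `mhhs_category` are hypothetical, and the raw pain and function point values are assumed to have been tallied already from the questionnaire. Note that (44 + 47) × 1.1 is 100.1, which the text reports as a total possible score of 100.

```python
def mhhs_total(pain_points: float, function_points: float,
               multiplier: float = 1.1) -> float:
    """Scale raw MHHS points to a total score (hypothetical helper).

    pain_points: 0 to 44, function_points: 0 to 47, as stated in the text.
    """
    if not 0 <= pain_points <= 44:
        raise ValueError("pain_points must be between 0 and 44")
    if not 0 <= function_points <= 47:
        raise ValueError("function_points must be between 0 and 47")
    return (pain_points + function_points) * multiplier


def mhhs_category(total: float) -> str:
    """Map a total score to the interpretation bands quoted in the text."""
    if total >= 90:
        return "excellent"
    if total >= 80:
        return "good"
    if total >= 70:
        return "fair"
    return "poor"


if __name__ == "__main__":
    best = mhhs_total(44, 47)
    print(round(best, 1), mhhs_category(best))
    # Share of the raw points contributed by pain and by function
    print(round(100 * 44 / 91, 1), round(100 * 47 / 91, 1))
```

The last line of the example shows where the "approximately 48%" and "about 52%" figures come from: 44 and 47 points out of 91 raw points.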
Each ROM measurement was completed three different times and recorded by the recorder. The examiner was blinded to his/her own measurements and those of fellow examiners. Both hips of each volunteer were examined.
Standard goniometer assessment [40–41] was used, modified so that measurements were taken in the supine position to best replicate assessments performed in the clinical setting. The following passive end ranges of motion of the hip were assessed in supine: 1) flexion, 2) internal rotation with the hip flexed at 90 degrees, 3) external rotation with the hip flexed at 90 degrees, 4) abduction, and 5) adduction. The following passive end ranges of motion of the hip were assessed in prone: 1) extension, 2) internal rotation with the knee flexed at 90 degrees, 3) external rotation with the knee flexed at 90 degrees. The examiner was blinded to his/her own measurement by placing construction paper over the number values on one side of the goniometer.
For each measurement, stabilization was provided by the examiner’s free hand to the adjacent joints or regions, the lumbopelvic region and the knee. The examiner passively moved the lower extremity to determine the end range of motion of the joint. The end of motion was defined as a firm end-feel without any additional motion occurring at the pelvis. Once the end of motion was determined, the limb was held by the assistant. If any motion occurred during this transfer, the examiner started over and placed the hip at the end range of motion. The examiner placed the goniometer at the angle indicated by the ROM but was blinded to the measurement with covering over the values on one side of the goniometer. A recorder then read the measurement and recorded the data. The range of motion was performed and recorded three times for each measure and for both hips.
The following provocative tests were performed: 1) resisted straight leg raise, 2) FABER/Patrick’s test, 3) hip impingement test, 4) log roll test. These tests were chosen based on their wide use in clinical practice among multiple disciplines. Each provocative examination was performed once on each hip by each individual examiner. A report of pain in the groin, lateral hip, or posterior pelvis was recorded as a positive provocative test.
Provocative tests were performed with the volunteer in the supine position. The examiner passively positioned the volunteer in the provocative position for the FABER/Patrick’s and hip impingement tests and resisted the active straight leg raise (SLR) of the volunteer. The volunteer was asked if pain was provoked in the groin, lateral hip, or posterior pelvic region. The response of pain was recorded as positive and the region of pain was noted.
Range of motion was measured for the left and right hip of each subject at three trials by each of two examiners. The three measurements were used to calculate intra-rater reliability, where the data from each of the two examiners were treated as independent observations. An additional analysis was performed to reflect intra-rater reliability for each examiner discipline (i.e., physical therapist and clinician). An ancillary analysis was performed with the omission of data from the second examiner, and the resultant intra-rater reliability estimates did not differ significantly from the inclusive data set that is reported. Reliability was calculated for the right and left hip separately, and for both sides combined where data from each hip were treated as independent observations.
The mean of the three measurement trials was used to calculate inter-rater reliability. Reliability was calculated for the entire sample to reflect agreement for two examiners, regardless of their discipline. An additional analysis was performed where data were compared for a subset of 18 patients in which one rater was a physical therapist and the other rater was a clinician.
Provocative tests were performed for a single trial, thus intra-rater reliability could not be calculated. Inter-rater reliability for these measures could not be estimated as the prevalence of positive provocative tests was near zero. All provocative tests were performed as described in several reference texts.[13–14, 25] Rater reliability is expressed with one-way random effects intraclass correlation coefficients (ICCs) and corresponding 95% confidence intervals (CIs). The one-way model was used because, although 16 examiners were recruited, only 2 examiners evaluated a given subject. ICCs range from 0 with no agreement to +1 with perfect agreement. In interpretation of the ICC, Landis and colleagues have provided general guidelines as follows: less than 0.4, poor; 0.4 to 0.75, fair to good; and greater than 0.75, excellent. Although arbitrary, these divisions may provide useful benchmarks for interpreting the adequacy of agreement. Where ICCs were calculated, the within-subject coefficient of variation (CV) is also reported to indicate measurement precision, expressed as a percent of the subject’s mean score. Data are reported as mean ± standard deviation (SD).
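As an illustration of the statistics named above, the following Python sketch computes a one-way random effects ICC for a single measurement from a subjects-by-ratings table, along with a within-subject CV. It is a minimal sketch under stated assumptions, not the authors' analysis code: the function names are hypothetical, the ICC uses the standard one-way ANOVA form (MSB − MSW) / (MSB + (k − 1) MSW), and the CV is taken as the square root of the within-subject mean square divided by the grand mean, which is one common definition; the paper does not state which formula was used. Confidence intervals are omitted.

```python
from math import sqrt
from statistics import mean


def _mean_squares(ratings):
    """Return (MS between subjects, MS within subjects, grand mean)."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 ratings")
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(ratings, row_means) for v in row
    ) / (n * (k - 1))
    return ms_between, ms_within, grand


def icc_oneway(ratings):
    """One-way random effects ICC, single measurement (hypothetical helper).

    ratings: one row per subject (or hip), one column per rating.
    The estimate can be negative when within-subject variance dominates.
    """
    ms_between, ms_within, _ = _mean_squares(ratings)
    k = len(ratings[0])
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def within_subject_cv(ratings):
    """Within-subject CV in percent: sqrt(MS within) / grand mean * 100.

    Assumed definition; the source does not give its formula.
    """
    _, ms_within, grand = _mean_squares(ratings)
    return 100.0 * sqrt(ms_within) / grand


if __name__ == "__main__":
    # Made-up flexion angles (degrees) for four hips, two examiners each
    demo = [[118, 122], [105, 109], [130, 127], [112, 118]]
    print(round(icc_oneway(demo), 2), round(within_subject_cv(demo), 1))
```

For intra-rater reliability each row would hold the three trials from one examiner on one hip; for inter-rater reliability each row would hold the two examiners' three-trial means for that hip.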
Twenty-eight healthy, asymptomatic volunteers (10 men and 18 women), primarily Caucasian (96%) and aged 18–51 years (Table 1), were recruited and completed the study. The average body mass index (BMI) was 24.5 kg/m2 and ranged from underweight to obese (17.5–33.1 kg/m2). The mean MHHS was 99.5 ± 1.5 and UCLA Activity score was 8.8 ± 1.2, confirming that volunteers were active, healthy, and asymptomatic.
Intra-rater agreement was excellent for all hip range of motion measurements, with ICCs ranging from 0.76 to 0.97 (Table 2). For virtually all variables, the confidence intervals surrounding the ICC for the right and left hip overlapped substantially, indicating that intra-rater agreement was not affected by the side tested. Additionally, agreement was similar whether the examiner was a physical therapist or a clinician.
For most tests, within-subject CVs reflected acceptable precision of the measurements from the three trials. Flexion in supine yielded the most precise measurements (CV = 3%), followed by external rotation with hip flexed (CV = 7%), external rotation with knee flexed (CV = 8%), internal rotation with knee flexed (CV = 10%), abduction in supine (CV = 10%), internal rotation with hip flexed (CV = 14%), extension in prone (CV = 16%), and adduction in supine (CV = 21%).
The mean hip passive ROM values for the left and right hip individually and combined are listed in Table 3. Inter-rater reliability ranged from excellent to poor across the hip range of motion measurements and was not influenced by the side tested (Table 3). Excellent agreement among examiners was found for hip flexion in supine with an ICC for both sides combined of 0.87 (95% CI 0.78 to 0.92) and an average between-rater difference of only 6.5° ± 5° (CV = 5%). Excellent agreement and similar between-rater differences were found for internal rotation with hip flexion at 90 degrees in supine (ICC for both sides combined = 0.75, 95% CI 0.60 to 0.84; CV = 20%) and internal rotation with 90 degrees knee flexion in prone (ICC for both sides combined = 0.79, 95% CI 0.66 to 0.87; CV = 18%). The least reliable measurements, with extremely poor agreement among examiners, were hip abduction in supine (CV = 20%) and hip external rotation with the knee flexed at 90 degrees in prone (CV = 18%) (ICC for both sides combined = 0.34 and 0.18, respectively). Adduction in supine and extension in prone tests yielded poor measurement precision with CVs of 38% and 28%, respectively.
Rater agreement did not change when calculated for the subgroup of subjects examined by a physical therapist at one occasion and a physician at the other (Table 3). The reliability estimates generated for the entire sample are similar to the ICCs comparing measurements from a physical therapist to measurements from a clinician, with substantial overlap in the confidence intervals.
Provocative tests generated negative findings for almost all hips tested, thus reliability estimates could not be calculated. Of the 56 hips tested, 52 were found to have a negative anterior hip impingement test and two were found to have a positive test by both examiners (96% agreement). For the FABER/Patrick’s test, 54 hips were found to be negative and one hip positive by both examiners (98% agreement). Examiners agreed that 54 hips had a negative straight leg raise (96% agreement) and 55 hips had a negative log roll in internal rotation (98% agreement). All fifty-six hips were determined to have a negative log roll in external rotation by both examiners (100% agreement).
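The percent agreement figures above follow from counting the hips on which both examiners recorded the same result. The Python sketch below reproduces the impingement test figure from the counts reported in the text (52 hips negative by both examiners, 2 positive by both, out of 56). The function name `percent_agreement` is hypothetical, and how the remaining two discordant hips split between the examiners is an assumption made only to build the example lists; it does not affect the result.

```python
def percent_agreement(rater_a, rater_b):
    """Percent of paired observations on which two raters agree."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


if __name__ == "__main__":
    # True = positive test, False = negative test, one entry per hip.
    # 52 concordant negatives and 2 concordant positives come from the text;
    # the split of the 2 discordant hips is assumed for illustration.
    examiner_1 = [False] * 52 + [True] * 2 + [True, False]
    examiner_2 = [False] * 52 + [True] * 2 + [False, True]
    print(round(percent_agreement(examiner_1, examiner_2)))  # 54 of 56 hips
```

Percent agreement does not correct for chance, which matters when nearly every result is negative; that is consistent with the text's statement that reliability estimates could not be calculated for these tests.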
The multidisciplinary group was reliable in assessing specific passive hip ROM measurements in asymptomatic adults between the ages of 18 and 51. All ROM parameters tested showed excellent intra-rater reliability. The best reliability between raters was found for supine hip flexion, internal rotation in supine with the hip flexed at 90 degrees, and internal rotation in prone with the knee flexed at 90 degrees. Fair to good reliability between raters was found for hip external rotation in supine with the hip flexed at 90 degrees (ICC 0.67 with 95% CI 0.44, 0.82), hip adduction (ICC 0.61 with 95% CI 0.36, 0.78) and hip extension in prone (ICC 0.49 with 95% CI 0.20, 0.70). The reliability between examiners remained the same when comparing the results among clinicians from different medical disciplines. The results are important for future studies planned to examine large numbers of subjects to determine an appropriate hip screening examination, as not all examiners will be available for all potential examination dates. These data suggest that, if trained uniformly, multiple examiners from different medical disciplines can reliably measure specific hip ROM measurements with a goniometer. A goniometer is a simple, inexpensive device that can be used in a variety of examination environments and is therefore a good choice for a screening examination. The detection of bony abnormalities related to osteoarthritis of the hip on physical examination has been established. Steultjens and colleagues studied knee and hip ROM and disability in patients with knee or hip osteoarthritis. Greater levels of disability were found to be associated with less joint range of motion. Some 25% of the variation in disability levels could be accounted for by differences in ROM. In both knee and hip osteoarthritis, flexion of the knee and extension and external rotation of the hip were found to be most closely associated with disability.
Birrell et al predicted radiographic hip osteoarthritis from ROM. Reduced internal rotation was found to be the most predictive of radiographic osteoarthritis while reduced flexion was the least predictive. Reduced ROM in all three planes had greater discrimination (sensitivity was 33% for mild to moderate osteoarthritis and 54% for severe osteoarthritis; specificity was 93 and 88% respectively). The authors concluded that reduced ROM was predictive of the presence of osteoarthritis in patients with newly presenting hip pain, and the results of a ROM physical examination could be used to guide decisions regarding radiography.
Data are emerging with regard to physical examination in patients with established intra-articular hip disorders prior to the onset of osteoarthritis (early intra-articular hip disorders). Clohisy et al outlined general guidelines with regard to surgical treatment options to enable surgeons to categorize young adults with hip disorders. The authors found the initial physical examination including passive hip ROM assessment to be critical in establishing an accurate diagnosis and developing an optimal surgical strategy. Several studies discussed hip pain in young adults with DDH and FAI.[3–4, 27] When these deformities are associated with symptoms, they can be unrecognized, untreated and potentially progress to osteoarthritis.[24, 27]
ROM measurements at the extremes of the asymptomatic population (either increased or decreased end ROM), combined with a positive hip provocative special test, may enhance decision making for both the diagnosis and treatment of early intra-articular hip disorders. Clohisy and colleagues studied the clinical evaluation of anterior FAI prior to confirmation with radiographic exam for the evaluation of surgical techniques. Patients with FAI had a positive impingement test with reduced hip flexion and internal rotation. Further, the authors reported 88% of 51 consecutive patients treated surgically for symptomatic FAI complained of groin pain with a hip impingement test. Passive end range hip flexion and internal rotation were limited to 97 and 9 degrees respectively. Another deformity, DDH, has not been studied extensively in association with hip pain in adults. Most of the studies investigating DDH have been done in neonates. Jari et al found bilateral limitation of hip abduction was not a useful clinical indicator of underlying hip abnormality because of its poor sensitivity. However, unilateral limitation of abduction of the hip was highly specific for detecting DDH in neonates. Further, Nunley and colleagues reported a positive hip impingement test in 90% of adult patients with DDH treated surgically with periacetabular osteotomy.
In this study, the small number of positive provocative hip examination tests in this asymptomatic population of adults will serve as a baseline for future screening studies. Though reliability could not be tested, agreement among examiners on the findings of all of the provocative hip tests studied here was high, ranging from 96–100%. Poor reliability between raters was found for some specific hip ROM assessments. These included hip abduction and prone hip external rotation with the knee flexed at 90 degrees.
Hip abduction and prone hip external rotation with the knee flexed at 90 degrees can be difficult to measure consistently. For hip abduction, the examiner must estimate the midline, and the end of the motion is difficult to assess. The end feel of hip abduction is often a soft tissue end-feel, which may be more subjective than determining a hard end feel. Often the end of motion was determined by the onset of compensatory motion of the pelvis, typically pelvic lateral tilt. Challenges in determining a soft tissue end-feel or motion at the pelvis might have contributed to poor inter-tester reliability. For prone hip external rotation, poor control of tibiofemoral motion may have contributed. A previous study by Harris-Hayes and colleagues showed that prone passive hip ROM examinations may be assessed with error if the examiner uses the tibia as a lever arm to produce passive hip ROM. Stabilization of the tibiofemoral joint was found to be important in taking reliable hip range of motion measurements. The multidisciplinary group in this study was reliable in measuring hip internal rotation in prone but less reliable in measuring hip external rotation in prone. Despite the training sessions completed by all examiners prior to examining volunteers, the lack of experience in controlling for tibiofemoral joint motion by all examiners may have influenced the consistency of the measurements in prone.
The use of multiple raters to assess reliability is unusual. Because this multidisciplinary hip research group plans to establish a hip screening examination, multiple examiners will be needed to assess a large number of subjects at various times and dates. The study group plans to implement only those examination features that are quick and reliable across medical disciplines and examiners. The group has successfully utilized this method of assessing inter-rater reliability among multiple medical disciplines and examiners for a weight-bearing examination of trunk motion in three planes. Though not specific for hip disorders, this examination may prove to be a useful tool in a screening examination because it assesses active motion in weight-bearing. We did not assess the accuracy of the recorders. There was no reason to expect that accuracy varied across the recorders. However, it is unknown if the use of different recorders biased the results. The blinding of the examiner was considered more important to reduce bias than the use of recorders. Other practical limitations prevented the use of a single recorder for all measurements. These included time (the entire exam required one hour to complete) and convenience for the subjects.
Other weaknesses of this study include the small number of subjects and the uniformity of race and ethnic background. The examinations took one hour per subject and required three clinical personnel (the recorder, the examiner, and the person to hold the extremity while the measurement was taken) to complete. Time and examiner availability precluded enrollment of a greater number of subjects. The specific hip passive ROM comparisons between ethnicities are unknown. The vast majority of subjects successfully recruited for this study were Caucasian. Further studies are needed to determine if there are passive hip ROM differences between race and ethnicities.
Intra-rater reliability for passive hip ROM goniometer measurements was excellent among our multidisciplinary group. Specific passive hip range of motion goniometer measurements (hip flexion, supine hip internal rotation with the hip flexed at 90 degrees, prone hip internal rotation with the knee flexed at 90 degrees) show excellent inter-rater reliability in raters of varying medical disciplines. The inter-rater reliability of hip internal rotation with the hip in flexion is especially important because this motion has been found to be reduced in patients treated for FAI [23, 27, 30] and therefore may serve as an important measurement to assess in a hip screening examination. Further, there are few asymptomatic adults with positive hip provocative physical examination tests and examiners of multiple disciplines show excellent agreement on these findings. Collectively, these measurements showed good to excellent reliability and agreement among examiners of multiple medical disciplines. Future studies are needed to implement them as part of a hip screening examination to detect individuals at risk for early intra-articular hip disorders.
This publication was made possible by Grant Number UL1 RR024992 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), and NIH Roadmap for Medical Research. Its contents are solely the responsibility of the authors and do not necessarily represent the official view of NCRR or NIH.