The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction.
A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects.
The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology.
Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction.
2b (Systematic Review of Literature)
Over the past decade, physical therapists have adopted the International Classification of Functioning, Disability, and Health (ICF) model1 to evaluate and treat patients of all levels of health and disability. The ICF model encourages assessment of patients, including athletes, within the context of their function. Therefore, an athlete, or a subject that is regularly participating in an organized sport activity or training, should be evaluated with consideration of the physical demands relative to their sport-related activities. Current evaluation procedures for the hip including range of motion (ROM), strength, and special tests are intended to identify a specific pathology or impairment,2 not necessarily to identify dysfunction. Dysfunction can be defined as pain, asymmetry, or injury that impairs normal movement and performance of a functional activity. Clinicians need to understand which sport-specific functional performance tests can best be used to identify and treat hip dysfunction of an athlete.3
Functional performance tests require the integration of multiple body regions and systems to execute movement patterns and therefore may have an advantage over more traditional clinical measures. Components of ROM, flexibility, muscular strength, endurance, coordination, balance, and motor control of multiple regions can be assessed simultaneously by observing the movement patterns in which the athlete normally functions.4,5,6 Functional performance tests have been commonly used to identify impairments related to ankle7–15 or knee injuries16,17,18–21 and determine the readiness of an athlete to return to sports after injury.22 Similar information has not been established for the hip. Although athletes with hip dysfunction are commonly encountered in a sports medicine rehabilitation setting, currently it is unclear which functional performance tests are most appropriate to use in this population.
To help clinicians employ appropriate functional performance tests in the evaluation of hip dysfunction, the evidence for reliability and validity of the functional performance tests should be considered. Reliability describes how well the test can be reproduced under the same conditions. Validity describes how well a test measures what it is intended to measure. The population in which the evidence for reliability and validity of a functional performance test is established is also an important consideration. Evidence of reliability and validity should be established among a sample of subjects that are similar to the population of patients for which the test is to be used.24 Clinicians need to determine which functional performance tests are valid and reliable for use during evaluation of athletes with hip dysfunction. The purpose of this systematic review was to examine current evidence for reliability and validity of functional performance tests used in a young, athletic population with hip dysfunction.
An electronic search of the PubMed and SPORTDiscus databases was performed for the purpose of identifying peer-reviewed articles that utilized functional performance tests to assess function of the hip joint and related structures. The following key words were used in combination for the search of the respective databases: “functional” AND “test”, OR “measure”, OR “assessment”, OR “screen”, AND “lower extremity” OR “hip”. The primary author reviewed the abstracts of the articles identified from the database searches, and duplicate studies were removed. The full-text articles describing a functional performance test related to the hip joint were retrieved for data extraction, and their references were reviewed for additional articles of interest. In addition, the individual names of the identified functional performance tests, including the deep squat, single-leg squat, single-leg balance, Star Excursion Balance Test, Balance Error Scoring System, Functional Movement Screen™, single leg hop, triple hop, side hop, timed hop, medial hop, lateral hop, figure 8 hop, cross-over hop, square hop, agility T-test, modified agility T-test, and reactive agility test, were searched independently to provide a comprehensive list of functional performance tests used for patients with hip dysfunction.
Studies were included if the following criteria were met: 1) written in English, 2) published in a peer-reviewed journal from 1990 to present, and 3) described the use of a functional performance test in the assessment of ROM, strength, balance, postural control, or athletic performance of the lower extremity. Studies were excluded if they described patient self-report measures, targeted an elderly population (>60 years of age), or performed the functional performance test on subjects with neurological involvement caused by cerebral palsy, a cerebrovascular accident, or head injury.
Functional performance tests were grouped into one of four categories: 1) movement, 2) balance, 3) hop/jump, or 4) agility. A movement test was defined as a quantitative or qualitative measure of the subject's ability to perform a specific motor pattern with control, precision, and symmetry. Tests of balance measured the patient's ability to maintain balance and postural control under different conditions. Hop/jump tests assessed the quality and/or quantity of tasks related to propulsion and/or absorbing impact. Agility tests assessed the subject's ability to run, cut, pivot, and/or change direction through a predetermined course.
Statistical evidence for test-retest, intra-rater, or inter-rater reliability was recorded as it was originally reported in the identified articles. This included intra-class correlation coefficients (ICC) for interval data or a kappa statistic for ordinal and nominal data. An ICC or kappa statistic typically ranges from 0.0 to 1.0. Values greater than or equal to .75 are considered excellent, values from .40 to .74 are considered moderate, and values less than .40 are considered poor.25 Test-retest reliability refers to the agreement of two separate trials of the same test by a single investigator.26 Intra-rater reliability is the agreement between repeated ratings of a single testing session by a single investigator. This is usually established through videotaped recordings of a single testing session. Inter-rater reliability is the agreement between two separate investigators rating the same test. Intra-rater reliability and test-retest reliability are similar and are often referred to interchangeably in the literature. There is, however, an important distinction between the two. Intra-rater reliability isolates intra-rater error because the same performance is evaluated on two separate occasions (i.e. a single evaluator watches the videotape and grades a single test performance two separate times).26 Test-retest reliability accounts for intra-rater error in addition to the variability of performance across two separate testing sessions (i.e. a single evaluator grades the performance of subjects performing a functional performance test on two separate occasions).26 Test-retest reliability more closely represents how functional performance tests are likely to be employed in a clinical environment. The authors of the original articles reviewed for this analysis may not have used the terminology as defined above. Therefore, when reporting reliability, the terminology used in the original article was retained.
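The interpretation thresholds described above can be written as a small lookup. The following is a minimal sketch in Python; the function name `classify_reliability` is hypothetical and is introduced only for illustration, and the cut-offs are the ones quoted in the text (.75 and .40).

```python
def classify_reliability(coefficient: float) -> str:
    """Label an ICC or kappa value using the thresholds quoted in the text.

    >= .75 is "excellent", .40 to .74 is "moderate", < .40 is "poor".
    """
    # The text describes coefficients on a 0.0 to 1.0 scale, so values
    # outside that range are rejected in this sketch.
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("expected a coefficient between 0.0 and 1.0")
    if coefficient >= 0.75:
        return "excellent"
    if coefficient >= 0.40:
        return "moderate"
    return "poor"


# Example: label a few coefficients of different magnitude.
for value in (0.21, 0.58, 0.61, 0.75, 1.00):
    print(f"{value:.2f} -> {classify_reliability(value)}")
```

Note that a fixed three-way labeling is only a reading aid; the original articles may have applied different descriptors to the same coefficients.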
Evidence of validity of a functional performance test can be established by demonstrating its relationship to subjects with a known dysfunction of the hip joint. Such a relationship may be expressed as a value of sensitivity or specificity in detecting the presence of dysfunction.25 Validity may also be established by a relationship, expressed statistically by a correlation coefficient, to other variables of hip function.25 For instance, a relationship of a functional performance test to ROM or strength values of the hip joint may offer evidence of validity to a functional performance test.
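As an illustration of how sensitivity and specificity are derived, the sketch below computes both from the four cells of a 2×2 table that compares a functional performance test result with a reference diagnosis. The function names and the counts are hypothetical; the counts are invented for the example and are not taken from any of the reviewed studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of subjects WITH the dysfunction who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of subjects WITHOUT the dysfunction who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumption, for illustration only):
# 10 subjects with confirmed dysfunction, 50 without.
true_pos, false_neg = 8, 2    # dysfunction present: test positive / negative
true_neg, false_pos = 45, 5   # dysfunction absent:  test negative / positive

print(f"sensitivity = {sensitivity(true_pos, false_neg):.1%}")
print(f"specificity = {specificity(true_neg, false_pos):.1%}")
```

A test with high sensitivity rarely misses subjects who have the dysfunction, while a test with high specificity rarely flags subjects who do not.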
Evidence for score interpretation was also extracted when available. This included normative values as well as the smallest detectable difference (SDD) or the minimal detectable change (MDC) as reported in the original article. The SDD and MDC are often described interchangeably and represent the change or difference between test scores that distinguish error from true changes in the measurement.23
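For readers unfamiliar with how an MDC is obtained, the sketch below uses one commonly cited formulation: the standard error of measurement is SEM = SD × √(1 − ICC), and the MDC at the 95% confidence level is 1.96 × √2 × SEM. This is an assumption for illustration; the reviewed articles may have calculated SDD or MDC differently. The function names and the input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), using the sample SD and a reliability ICC."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd: float, icc: float, z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM; z = 1.96 corresponds to 95% confidence."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs (assumption): a reach test with SD = 5.0 cm, ICC = .84.
example_mdc = minimal_detectable_change(sd=5.0, icc=0.84)
print(f"MDC95 = {example_mdc:.2f} cm")
```

Under this formulation, a change between two test scores smaller than the MDC cannot be distinguished from measurement error, which is why higher reliability (a larger ICC) yields a smaller MDC.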
The search results consisted of 18 articles describing movement tests, 24 articles describing balance tests, 26 articles describing hop/jump tests, and 6 articles describing agility tests, as shown in Figure 1. Review of the articles revealed four functional performance tests that demonstrated evidence of validity. These included two movement tests (the deep squat test27 and single-leg squat test28,29) and two balance tests (the single-leg stance test30 and Star Excursion Balance Test).31,32,33 The deep squat was the only functional performance test that was performed on subjects with hip dysfunction.27 During the deep squat test, patients with femoroacetabular impingement (FAI) demonstrated less squat depth and altered lumbo-pelvic kinematics compared to healthy controls.27 The single-leg squat, single-leg stance, and the Star Excursion Balance Test (SEBT) were functional performance tests that demonstrated evidence of validity through a relationship to hip abductor muscle function. Subjects graded as “poor” on the single-leg squat test exhibited weaker and slower muscle activation of the hip abductors than those graded as “good”.28,29 Provocation of pain during 30-second single-leg stance had high sensitivity (100%) and specificity (97.3%) in detecting tendinopathy of the gluteus medius and minimus.30 Posterior-medial and posterior-lateral reach distances of the SEBT have been correlated with hip abduction and extension strength (r = .48–.51),31 and the medial reach of the SEBT elicited activation of the gluteus medius at 49% of maximal volitional isometric contraction.33 The SEBT also demonstrated a relationship to hip ROM. Hip flexion ROM was shown to explain a high percentage of variance (92–95%) in SEBT scores.32 Table 1 summarizes the evidence of validity for the use of functional performance tests in subjects with hip dysfunction. There was no evidence of validity for hop/jump or agility tests for patients with hip dysfunction.
None of the functional performance tests demonstrated evidence of reliability in a population of young, athletic patients with hip dysfunction. However, evidence of reliability for functional performance tests in healthy subjects, with normative values for score interpretation, was identified for 2 movement, 4 balance, 11 hop/jump, and 3 agility tests. Ten of these functional performance tests also had established SDD or MDC values in healthy subjects. This information is summarized in Table 2. The authors of a majority of the articles reported moderate to excellent reliability (.61–1.00) of the functional performance tests in groups of healthy subjects. The single-leg squat, single-leg balance, and Star Excursion Balance Test, however, had conflicting evidence of reliability, with some reports suggesting poorer reliability (.21–.58) for these specific functional performance tests.
The primary purpose of this study was to systematically review the literature for evidence of reliability and validity of functional performance tests used with a young, athletic population with hip dysfunction. The single-leg stance test, deep squat test, single-leg squat test, and SEBT demonstrated evidence of validity to be used in a population of patients with suspected hip dysfunction. The evidence for validity suggests that gluteal tendinopathy and function of the hip abductors may be assessed with the single-leg stance test,30 single-leg squat test,28,29 and SEBT.32,33 The deep squat test demonstrated evidence of validity as a functional performance test for evaluating patients with suspected FAI.27 Additionally, there were 20 tests that had evidence of reliability in healthy subjects with normative data provided to aid in score interpretation. Clinicians may use these normative data to identify impairments in patients with hip dysfunction who score outside the normal range of healthy subjects on the functional performance tests.
The authors of the current systematic review aimed to identify existing functional performance tests that demonstrated evidence for validity in a population of young, athletic patients with hip dysfunction. Only the single-leg stance test and deep squat test had evidence of validity in patients with confirmed hip dysfunction. The single-leg stance test was performed on subjects with greater than 4 months of lateral hip pain. Provocation of pain during 30 seconds of single-leg stance had high sensitivity (100%) and specificity (97.3%) in detecting tendinopathy of the gluteus medius and minimus confirmed by magnetic resonance imaging (MRI).30 Based on the current evidence, the single-leg stance test has clinical value in differentiating gluteal tendinopathy, otherwise known as greater trochanteric pain syndrome (GTPS), from other potential sources of lateral hip pain, including lumbosacral, sacroiliac, or intra-articular pathology.30 The deep squat test was performed on subjects with radiologically confirmed FAI. The maximal squat depth in subjects with FAI (41% of leg length) was significantly less when compared to healthy controls (32% of leg length).27 Clinicians may test maximum squat depth in patients with suspected FAI to help confirm the diagnosis. Further studies are needed to determine how the single-leg stance and deep squat tests may complement current clinical exam procedures to identify the presence of specific hip dysfunction.
While the single-leg stance and deep squat tests provided evidence of validity in subjects with hip dysfunction, the SEBT and single-leg squat test provided evidence of validity through an analysis of kinematics and muscle function in normal subjects. Three studies have related SEBT performance to kinematic and muscle function variables of the hip joint. Hip flexion range of motion was shown to explain a high percentage of variance (92–95%) in SEBT performance.32 An electromyographic study of the gluteus medius demonstrated that the medial reach of the SEBT elicited gluteus medius activation at 49% of maximal volitional isometric contraction.33 Hip abduction and extension strength also demonstrated a moderate correlation (r = .48–.51) with posterior-medial and posterior-lateral reach distances of the SEBT.31 The moderate correlation implies that gluteal muscle strength only partially accounts for the variance of SEBT scores. The single-leg squat also demonstrated a relationship to hip abductor muscle function.29 However, the strength of this relationship has been disputed. DiMattia et al.28 reported a poor association (r = .21) between the single-leg squat and hip abductor strength. The SEBT and the single-leg squat test have not been studied in a population of patients with hip dysfunction, but may have value in helping clinicians screen for ROM and muscle strength impairments.28–33 ROM and strength deficits are commonly associated with hip pathology including FAI,34 osteoarthritis,35 or GTPS.36 A positive finding or asymmetry on the SEBT or single-leg squat test may lead the clinician to perform goniometry or dynamometry to further quantify ROM and strength deficits observed during the functional performance tests.
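The statement that a moderate correlation only partially accounts for variance can be made concrete by squaring the correlation coefficient. The sketch below assumes a simple linear relationship between two variables, in which r² is the proportion of shared variance; the function name `variance_explained` is hypothetical, and the r values are those reported in the text.

```python
def variance_explained(r: float) -> float:
    """Coefficient of determination (r squared) for a simple correlation."""
    return r ** 2


# Correlation coefficients quoted in the text: SEBT reach versus hip
# strength (.48 and .51) and single-leg squat versus abductor strength (.21).
for r in (0.48, 0.51, 0.21):
    print(f"r = {r:.2f} -> about {variance_explained(r):.0%} of variance shared")
```

Under this assumption, correlations of .48 to .51 correspond to roughly a quarter of the variance in reach distance being shared with hip strength, which leaves most of the variance attributable to other factors.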
The reliability of a test is important to be able to confidently interpret the results. There were 2 movement, 4 balance, 11 hop/jump, and 3 agility tests with evidence of reliability in a young, healthy population. The Functional Movement Screen™ (FMS™) is a series of seven individual movement tests that have been shown to be reliable in screening and evaluating athletes.37 Each test is graded on an ordinal scale based on the ability of the subject to perform specific motor functions.37 The FMS™ is designed to be a comprehensive cross section of functional movement and has been used to predict an athlete's risk for non-specific injury.38 The FMS™ is an intriguing tool for patients with varied hip dysfunction as it tests multiple movement patterns that require different components of hip ROM, strength, and trunk control. Such tests may elicit familiar symptoms or indicate impairments related to FAI, labral tears, osteoarthritis, or GTPS. Clinicians may use normative data established for the FMS™ as a guide to identify abnormal findings in patients with hip-related dysfunction.37 Further study is needed to determine if the FMS™ is able to accurately predict hip-specific injuries.
Similar to the FMS™, hop/jump tests also demonstrated good to excellent reliability in normal subjects. Hop tests have also shown the ability to discriminate injured from uninjured lower extremities, particularly in the assessment of ankle instability and post-operatively following ACL reconstruction.39–42 Researchers have established normative, gender-specific values for hop tests19,43 in young, athletic subjects. These values may serve as benchmarks for interpreting an “abnormal” score for a subject with hip-related dysfunction. Field agility tests have demonstrated evidence of good reliability,19,43–45 but have not been able to discriminate injured versus uninjured limbs in the same manner as hop tests. This is likely because agility tests require bipedal movement. However, agility tests may have value in an athletic population as the tests may more closely mimic the dynamic requirements of sport activity. The reliability of hop/jump and agility tests has not been established in patients with hip dysfunction. Without further study, it remains unclear how patients with hip dysfunction perform on these tests. For patients with unilateral hip symptoms, hop test scores on the involved side may be compared with those of the uninvolved side. Interpretation of agility test results is limited to a comparison with scores established on healthy subjects. Whether hop/jump tests or agility tests can be used to discriminate subjects with hip-related dysfunction remains unknown.
There are limitations of this systematic review that should be acknowledged. First, the authors established very specific inclusion/exclusion criteria for selection of functional performance tests included in this review. This included only exploring functional performance tests for a young and athletic population. Many tests were excluded because the studies were performed on elderly patients, or subjects with various neurologic or debilitating co-morbidities. Therefore, a number of articles that examined functional performance tests did not fit the inclusion criteria. It is possible some of these functional performance tests may have value in a population of subjects with musculoskeletal dysfunction. Given the volume of articles reviewed, it is also possible some functional performance tests were not identified. Significant variations in the descriptions of functional performance tests used by different authors were common. This may alter interpretation of the values attained for a specific functional performance test.
In conclusion, only the deep squat and single-leg stance tests demonstrated evidence of validity in a population of patients with hip-related dysfunction. Specifically, diminished squat depth and provocation of pain during the single-leg balance test may indicate FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat test provided evidence of validity through an analysis of kinematics and muscle function in healthy subjects. There were 20 functional performance tests, including the FMS™, with evidence of reliability and normative data to help in score interpretation. None of the articles provided evidence of reliability in a group of subjects with hip-related dysfunction. The absence of established reliability for these functional performance tests limits the ability of the clinician to confidently interpret test results and use the tests to measure patient progress. The results of the systematic review demonstrated that few functional performance tests have established validity and reliability to complement traditional clinical exam procedures for patients with hip dysfunction. Further study is needed to establish the reliability and validity of existing functional performance tests or to explore new, relevant functional performance tests to be used in a young, athletic population with hip dysfunction.