Nearly 340,000 hip fractures occur each year in the U.S. With current demographic trends, the number of hip fractures is expected to at least double in the next 40 years.
The Hip Impact Protection Project (HIP PRO) was designed to investigate the efficacy and safety of hip protectors in an elderly nursing home population. This paper describes the innovative clustered matched-pair research design used in HIP PRO to overcome the inherent limitations of clustered randomization.
Three clinical centers recruited 37 nursing homes to participate in HIP PRO. Each home was randomized so that its participating residents received hip protectors for either the right or left hip. Informed consent was obtained from either the resident or the resident's responsible party. The target sample size was 580 residents, with replacement of those who dropped out, had a hip fracture, or died. One of the advantages of the HIP PRO study design was that each resident was his/her own case and control, eliminating imbalances, and there was no confusion over which residents wore pads (or on which hip).
Generalizability of the findings may be limited. Adherence was higher in this study than in other studies because of: (1) the use of a run-in period, (2) staff incentives, and (3) the frequency of adherence assessments. The use of a single pad is not analogous to pad use in the real world and may have caused unanticipated changes in behavior. Fall assessment was not feasible, limiting the ability to analyze fractures as a function of falls. Finally, hip protector designs continue to evolve so that the results generated using this pad may not be applicable to other pad designs. However, information about factors related to adherence will be useful for future studies.
The clustered matched-pair study design avoided the major problem with previous cluster-randomized investigations of this question – unbalanced risk factors between the experimental group and the control group. Because each resident served as his/her own control, the effects of unbalanced risk factors on treatment effect were virtually eliminated. In addition, the use of frequent adherence assessments allowed us to study the effect of various demographic and environmental factors on adherence, which was vital for the assessment of efficacy.
The Hip Impact Protection Project (HIP PRO) investigated the efficacy and safety of hip protectors in the prevention of hip fractures in a nursing home population. In the United States, nearly 340,000 hip fractures occur each year, over 90% of them associated with falls [2,3]. Continued growth in the elderly population is expected to cause the number of hip fractures to increase dramatically, since hip fracture incidence rates increase exponentially with age [4,5]. The number of hip fractures may well double or triple by the middle of this century [6–8]. Further, the highest rates of hip fracture occur in the nursing home setting [9–11], where 50% or more of residents fall each year [12,13]. Effective strategies to reduce fractures in this setting are needed. Given that the energy generated by a fall on the hip from standing height far exceeds that necessary to fracture the hip, a variety of hip protectors have been developed to reduce the chance of a hip fracture from a fall. Prior randomized clinical trials, using both cluster-randomized designs and individual randomization, tested these hip protectors in nursing home settings. However, they have had two major faults. First, poor adherence compromised the success and interpretation of the results. Second, only studies using clustered randomization, i.e., randomization by nursing home or nursing home unit, demonstrated significant hip fracture reduction. Studies that randomized by individual, using primarily the same hip protector, failed to demonstrate significant fracture prevention, suggesting that cluster randomization created significant bias [14–16].
This paper describes the innovative clustered matched-pair research design used in HIP PRO to overcome the inherent shortcomings of clustered randomization and the methods used to promote adherence to wearing the protectors.
Our hypothesis was that trochanteric padding would prevent hip fractures in nursing home residents. Specifically, we hypothesized that an energy shunting and absorbing, trochanteric pad inserted into a side pocket of an undergarment would reduce the incidence of hip fracture on the protected side by 50% compared to the unprotected side. We conducted a randomized controlled trial in nursing home residents in three regions (Boston, MA; St. Louis, MO; and Baltimore, MD).
Residents over the age of 65 who were not bed- or chair-bound were recruited if they were long-stay residents and met specified eligibility criteria. All eligible residents with informed consent had a baseline assessment and then a 2-week run-in period. Compliance of at least 66% during the run-in period was required for inclusion in the main study. Each resident was provided undergarments containing a single pocket and protective pad so that they became their own control. The side to be protected was randomly assigned by nursing home to ensure that an equal number of residents had right- and left-sided protection, eliminating potential bias due to any tendency to fall on one side or another. Randomization by nursing home also eliminated the possibility that residents might be given a garment with the pocket on the wrong side. The primary outcome, hip fracture, was confirmed by obtaining either a copy of the X-ray of the hip or a radiologist's report and was reviewed by a fracture adjudication committee consisting of two orthopedic surgeons, a geriatrician and a musculoskeletal radiologist. Hip fracture incidence was compared between protected and unprotected hips using intention-to-treat analysis. The incidence of adverse events (e.g., skin breakdown) was compared between treated and untreated sides. A secondary aim was to identify factors (both resident- and facility-level) related to adherence for the benefit of future studies.
The clustered matched-pair design used in HIP PRO is similar to designs used in other studies of paired observational units (e.g., eyes, legs, hands) in which treatment of one side of the body is paired with a control on the other side of the body. These designs, however, do not usually include the ability to adjust for clustering at different levels (i.e., facility, unit) as was required in HIP PRO.
HIP PRO investigators also considered designs in previous studies of hip pad efficacy using nonclustered randomization at the resident level or cluster randomization by nursing home. Nonclustered randomization at the individual level (to either wear pads or not) has a number of logistic problems, as well as the ethical consideration of a placebo group despite relative equipoise with regard to this intervention. Also, crossovers might occur with family members buying hip pads for a resident in the 'placebo' group. Further, other residents, feeling uncomfortable wearing pads when others do not, might stop wearing them. In addition, there could be difficulty encountered by nursing home staff (especially temporary staff) in keeping the right residents in pads. Crossovers could have a serious detrimental effect on the power of a study (forcing an increase in the sample size by a factor of 1/(1 - λ)² to achieve the same power, where λ is the crossover rate) by blurring the differences between the treatment groups and reducing any treatment effect. Intention-to-treat analysis includes all residents as randomized, when, with 50% adherence (typical in many studies of hip protectors), one-half of them may no longer be in their assigned treatment group, virtually eliminating any possibility of detecting a treatment effect.
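The cost of crossovers can be made concrete with a short calculation. The sketch below assumes the commonly used inflation factor 1/(1 - λ)², where λ is the crossover rate; the function name and the example rates are illustrative and are not taken from the study.

```python
def crossover_inflation(crossover_rate: float) -> float:
    """Factor by which the sample size must grow to keep the same power.

    Assumes the inflation factor 1 / (1 - rate)^2; this is an assumption
    of the sketch, not a formula quoted from the HIP PRO protocol.
    """
    if not 0.0 <= crossover_rate < 1.0:
        raise ValueError("crossover_rate must be in [0, 1)")
    return 1.0 / (1.0 - crossover_rate) ** 2


# Illustrative crossover rates (hypothetical values).
for rate in (0.0, 0.1, 0.25, 0.5):
    factor = crossover_inflation(rate)
    print(f"crossover rate {rate:.2f} -> sample size x {factor:.2f}")
```

Under this assumption, a 50% crossover rate would quadruple the required sample size, which illustrates why individual randomization with poor adherence was considered unattractive.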
The second type of study design, cluster randomization by nursing home to either wearing protectors or not, also had significant problems. The main problem is the lack of comparability of nursing homes and their resident populations. In non-nursing home designs of individual randomizations, there are enough subjects so that characteristics are randomly allocated with few if any significant differences in baseline characteristics between treatment groups. However, with the randomization of nursing homes, there will be substantially fewer randomized, so that it is likely that significant differences between treatment groups could arise. Further, because recruitment is unpredictable, it is not possible to balance the number of residents in treatment groups, leading to large disparities in size as well as differences in baseline characteristics between treatment groups. Thus, the intention-to-treat analysis should be model-based to adjust for unbalanced factors. In addition to adjustment for observed factors, terms for the nursing homes would have to be included in the model to adjust for unmeasured, but imbalanced, factors. Such a model would be very difficult to fit to the data unless there were a small number of nursing homes and a large number of outcomes (fractures). Finally, in both of these study designs, half of the residents do not receive any intervention, leading to problems in recruiting nursing homes and residents (especially in the design where half of the nursing homes will be in the study but without an intervention).
The choice of a blended clustered matched-pair design in HIP PRO presented two problems that investigators considered. First, hip fractures were expected to occur equally on left and right sides in nursing homes. Based on prior studies, it was known that the incidence of left and right hip fractures is the same. Investigators were aware that if more hip fractures occurred on a side that was less likely to be protected when trying to balance left- and right-sided assignments, it could introduce a serious design flaw. Treatment of one side could affect the outcome of the other side, but there was no evidence that wearing a hip pad on one hip would cause residents to fall preferentially to one side. In view of the available alternatives, the matched-pair design was considered the best approach.
The second design problem considered was how many layers of clustering should be taken into account. We allowed for clustering at the nursing home level, because many nursing homes tend to have similar types of residents, based on geography, religious affiliation, ethnic background, or gender. There may also be clustering at the nursing unit level since residents with similar levels of mobility and illness might be grouped together. While it is not possible to determine a priori the intraclass coefficient (which estimates the strength of the cluster effect), there is little loss of statistical efficiency with the inclusion of clustering in the model since, if the intraclass coefficient is essentially zero, the adjustment in the mixed models also is essentially zero.
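The point that a near-zero intraclass coefficient costs little can be illustrated with the textbook design-effect approximation, 1 + (m - 1)ρ, for clusters of average size m. This is only a rough sketch: HIP PRO used mixed models rather than this formula, and the cluster size and coefficient values below are hypothetical.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Textbook variance inflation for equal-sized clusters: 1 + (m - 1) * icc."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Number of independent observations that n clustered ones are 'worth'."""
    return n / design_effect(mean_cluster_size, icc)


# 580 is the target sample size from the text; a cluster size of 20 and the
# ICC values are hypothetical choices for illustration only.
for icc in (0.0, 0.01, 0.05):
    print(f"ICC {icc:.2f}: design effect {design_effect(20, icc):.2f}, "
          f"effective n {effective_sample_size(580, 20, icc):.0f}")
```

When the intraclass coefficient is zero, the design effect is exactly 1 and the effective sample size equals the nominal one, matching the argument that allowing for clustering loses little efficiency in that case.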
Each clinical site (Boston, St. Louis, Baltimore) recruited nursing homes to participate in the study based on: (1) number of eligible residents (to maximize the sample size in consideration of the exclusionary criteria), (2) geographic proximity to the site base of operation, (3) racial composition (to include nursing home residents of minority status), (4) willingness of nursing administration and staff to comply with the study requirements, and (5) no evidence of serious deficiencies of care as documented on the government web site www.nursinghomecompare.gov.
Once a nursing home agreed to participate, in-service training sessions were conducted for all three nursing shifts, providing background on hip fractures and reviewing the care and handling of hip protector garments. Emphasis was placed on resident comfort, help with dressing, undressing, and toileting, and the importance of adherence (i.e., wearing the garment with the pad at all times).
Subjects were recruited from a population of long-stay residents of U.S. nursing homes, 65 years and older who were not chair- or bed-bound. The other eligibility criteria, related to patient safety, appropriateness for the intervention under study, and ability to follow over time, were:
Eligible residents included a significant number with cognitive deficits who were at risk for hip fractures. Residents with a single prior hip fracture or hip replacement were also eligible. If a resident became unable to participate for more than 5 weeks due to lack of mobility, he/she was dropped from the study.
All residents in a facility were screened to identify those meeting the inclusion criteria. Residents considered cognitively intact by nursing home staff were consented by research staff. Otherwise, informed consent was obtained through the resident's responsible party. The research staff confirmed cognitive status with the Short Blessed Test. After obtaining informed consent, the resident's medical records were reviewed and the eligibility criteria were evaluated.
Eligible residents with informed consent were enrolled in a 2-week adherence run-in period. The resident had to demonstrate proper wearing of the protective underwear for at least four of six unannounced visits to proceed into the main study. A 2-week run-in was used because in our pilot studies, residents with dementia often required 2 weeks to accommodate to the presence of the pad. Among residents for whom consent was obtained from the responsible party, the run-in period also enabled the investigators to determine the residents' 'assent' to participate.
Recruitment of eligible residents continued throughout the study to replace residents leaving due to changed mobility status, hospitalization, refusal to wear protective underwear (after passing the run-in), or death. The resident was asked to wear the underwear during the day and while in bed at night, except in nursing homes where policy prevented residents from wearing undergarments at night. Adherence and safety were monitored by research staff making unannounced visits at least three times per week (including weekends) across all three nursing shifts. Hip fractures were ascertained over the entire observation period, including the run-in.
Pre-enrollment information on age, sex, ethnicity, weight and height, as well as medical history, was obtained from nursing home charts. Resident functional status was obtained from the Minimum Data Set (MDS), a nationally-mandated recording form, completed by a multidisciplinary team. Seven Activities of Daily Living (ADL) variables (dressing, personal hygiene, toilet use, locomotion on nursing unit, transfer, bed mobility, and eating) were abstracted from the MDS.
The Short Blessed Test was administered to all residents. This cognitive status measure correlates well with the longer Mini-Mental State Examination, and served to identify residents who were not able to provide informed consent and to provide an objective measure of cognitive function.
Nursing homes were randomized according to the matched-pair clustered design to eliminate the necessity of staff remembering individual resident left or right side pad assignments. All enrolled residents in a given nursing home wore the hip protector on the same side. Thus, if a hip fracture was reported in a compliant resident, the side protected was known with a high level of confidence. The side to be protected for a given nursing home was assigned using a minimization strategy based on the number of eligible beds to generate approximately equal numbers of residents with protection on each side. However, when recruitment in a nursing home was substantially less than the number of eligible beds, an imbalance between right and left hip pads resulted.
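A minimization strategy of this kind can be sketched as follows. The text does not give the exact algorithm, so the rule used here (assign each newly enrolled home to the side with fewer eligible beds so far, breaking ties at random) is an assumption, and the home identifiers and bed counts are hypothetical.

```python
import random


def assign_sides(eligible_beds, seed=0):
    """Assign a protected side to each nursing home by simple minimization.

    eligible_beds: list of (home_id, beds) in order of enrollment.
    Each home goes to the side with the smaller running total of eligible
    beds; ties are broken at random. This rule is an assumption of the
    sketch, not the documented HIP PRO procedure.
    """
    rng = random.Random(seed)
    totals = {"left": 0, "right": 0}
    assignment = {}
    for home_id, beds in eligible_beds:
        if totals["left"] < totals["right"]:
            side = "left"
        elif totals["right"] < totals["left"]:
            side = "right"
        else:
            side = rng.choice(["left", "right"])
        assignment[home_id] = side
        totals[side] += beds
    return assignment, totals


# Hypothetical homes and bed counts.
homes = [("A", 40), ("B", 25), ("C", 30), ("D", 20)]
assignment, totals = assign_sides(homes)
print(assignment, totals)
```

Because balance is computed on eligible beds rather than on enrolled residents, a home that recruits far fewer residents than its bed count can still leave the two sides unequal, which is the imbalance described above.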
In summary, this design allowed each resident to be his or her own control for unmeasured variables that might contribute to the occurrence of hip fracture, such as underlying bone mass, frequency and types of falls, functional status, gait and mobility. This design also eliminated bias due to the differential introduction of fall prevention strategies in recruited nursing homes since both hips were exposed to any intervention.
Each resident was provided with underwear, pads, and replacements. The underwear contained a single pocket on the left or right side that positioned the hip pad over the trochanter. Pads were removed prior to laundering the garment. Several styles of underwear were available including a unisex version, a 'fly' version for men, a 'snap' version that afforded caregivers an easier approach to changing incontinence products, and a version with an inside baffled pocket for residents with dementia which discouraged the removal of the hip pad. All styles permitted the use of incontinence products with the underwear. The underwear was fitted by the research staff with sizes from small to extra extra large available.
The success of the intervention depended on resident adherence. Various strategies to motivate staff and residents were developed, including frequent 'support visits' to the facility, which provided an opportunity to assess resident adherence and to give incentives for nursing home staff.
The ideal monitoring of adherence would have been an accurate accounting of the total number of hours that the hip protectors were worn as a percent of total follow-up time, as recorded by outside observers blinded to the study aims and hypotheses. In HIP PRO, adherence was monitored with unblinded research staff, who also conducted the support visits. Because of this problem, we instituted further quality assurance procedures (see Quality assurance section below). To assess adherence (defined as the hip protector being worn and properly positioned), research staff made three unannounced visits per week to residents, across all three nursing home shifts and days of the week including weekends.
Data were collected from interviews with nursing home administrators and directors of nursing about facility ownership, staff-to-resident ratios, and staff turn-over. Observational assessments of the physical environment were conducted to gather information on factors such as cleanliness, lighting, and floor coverings using a scale from the Therapeutic Environment Screening Survey for Nursing Homes. Medical chart review also collected resident information related to adherence. Finally, interviews with staff and cognitively intact residents provided information on the expected effect of wearing hip protectors and research staff provided their impressions – such as relation of adherence to overall facility investment – for this same purpose.
The primary endpoint, hip fracture, was systematically ascertained to ensure that all events were captured. Fracture ascertainment was the primary responsibility of research assistants during support visits. Research assistants looked for changes in resident functional status and other clinical signs indicative of a fracture and reviewed charts for indication of serious falls or hospitalizations due to fracture. When a suspected hip fracture was detected, research staff collected data from nursing home staff and charts, hospitalization records, and radiograph reports. These data were provided to a Clinical Endpoints Committee for blinded fracture adjudication, using fracture classification guidelines developed at the start of the study. A hip fracture was defined as: any fracture involving the neck of the femur, intertrochanteric region, and subtrochanteric region as far distally as 3 inches below the lesser trochanter; and periprosthetic fractures within 3 inches of the lesser trochanter. The Clinical Endpoints Committee confirmed (or not) the occurrence of hip fracture and identified the type (e.g., intertrochanteric, femoral neck, trochanteric, subtrochanteric) and side of fracture. The Clinical Endpoints Committee did not receive any information concerning which hip was protected or the identity of the resident. It was completely independent of the investigators and had no contact with the investigators except to receive material from the Data Coordinating Center. Because the research staff were not blinded to the protected side, we instituted quality assurance procedures to ensure complete ascertainment (see below).
The study design posed a challenge in the assessment of adverse events since all residents wore one protector. Without a completely untreated group, some adverse events, such as functional status changes or behavioral changes, would be difficult to attribute. The most common adverse events expected were skin problems under the hip protector. Skin changes on the side of the hip protector were compared to the unprotected side. Research assistants examined residents weekly or consulted nursing staff involved with resident skin care and dressing residents. The Braden Scale for pressure ulcers was used to grade skin lesions over the trochanteric regions. The Clinical Site Principal Investigator, who was blinded to protected side, judged whether an adverse event was related to the hip protector. To detect falls, research staff reviewed charts, spoke with nursing home staff, and reviewed fall logs kept by facilities.
The following quality assurance procedures were implemented: (1) all staff were trained in data collection procedures and periodically retrained; (2) clinic coordinators verified a sample of baseline chart abstractions originally performed by a research assistant; (3) resident characteristics noted in the chart were verified directly through observation; (4) clinic coordinators routinely verified resident compliance on the same day as research assistants; (5) clinic coordinators periodically reviewed the charts of all enrolled residents without a reported hip fracture to verify that all fractures had been reported; and (6) clinic coordinators reviewed all forms prior to submission to the Data Coordinating Center.
All forms were sent to the Data Coordinating Center (DCC) either by fax or by electronic file transmission. The data were read from the case report forms by Optical Character Recognition/Intelligent Character Recognition (OCR/ICR) techniques, verified, and entered into the study database. Comprehensive editing was performed and queries were returned to the clinical center for resolution.
The DCC was independent of the clinical sites and processed data when received. The DCC also processed the data from the Clinical Endpoints Committee without unmasking the fracture in terms of padded or unpadded side.
Because each resident was his/her own control, the data for a particular individual were paired. To simplify the sample size calculations (especially with a potentially unknown number of nursing homes at each clinical site), we ignored the cluster randomization of each nursing home (as a right hip versus left hip facility, which does not constitute a true treatment randomization) and so were able to approach the primary analysis of proportion with a hip fracture on the protected side versus the proportion on the unprotected side using a McNemar's test of equality of paired proportions. The technique adjusted for clustering of the paired data (see below), allowing more power than estimated for McNemar's test. However, since closed-form sample size formulas were not available for this technique, the more conservative estimates for McNemar's test were used.
Using data from a previous study of nursing home residents in Maryland, we estimated the overall incidence of 5.6 hip fractures per 100 person years of follow-up (or 2.8 hip fractures per hip). In that study, the difference in hip fracture rate between those with and without prior fractures was minimal. Because of the resident replacement feature in HIP PRO, the sample size was calculated in terms of numbers of events and person years. Table 1 shows the resulting estimates and also includes an adjustment for 50% lack of adherence.
Thus, the study was designed to recruit sufficient nursing home residents, with replacement of those who dropped out (i.e., died, moved to another facility, became bed-ridden), to generate 1632 person years of follow-up. This sample size provided 90% power to detect a 50% reduction in hip fracture in protected hips with 50% adherence.
It was not clear what effect previous hip fractures would have on the fracture rate. In approximately 50% of the cases with a history of a previous hip fracture or hip replacement, a resident would be assigned to wear the hip protector on the side of the previously repaired hip, which has a lower risk of re-fracture than a previously unfractured hip. Counterbalancing this reduced risk of re-fracturing was the increased risk of fracturing a second hip in those individuals with a history of previous hip fracture. This effect will be investigated in secondary analysis.
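The planning figures in the text translate into expected event counts as sketched below. The incidence (5.6 per 100 person years), follow-up (1632 person years), efficacy (50%) and adherence (50%) come from the text; the assumption that non-adherent time yields no benefit, so that the observable reduction is efficacy times adherence, is a simplification made for this illustration and is not the study's stated calculation.

```python
PERSON_YEARS = 1632        # planned follow-up, from the text
RATE_PER_100_PY = 5.6      # hip fractures per 100 person years, from the text


def expected_fractures(person_years: float, rate_per_100_py: float) -> float:
    """Expected number of hip fractures over the planned follow-up."""
    return person_years * rate_per_100_py / 100.0


def diluted_reduction(efficacy: float, adherence: float) -> float:
    """Observable reduction if non-adherent time gives no protection (assumption)."""
    return efficacy * adherence


total = expected_fractures(PERSON_YEARS, RATE_PER_100_PY)
per_side = total / 2.0                      # equal left/right incidence assumed
protected = per_side * (1.0 - diluted_reduction(0.5, 0.5))

print(f"expected fractures overall: {total:.1f}")
print(f"expected per side with no effect: {per_side:.1f}")
print(f"expected on protected side (diluted effect): {protected:.1f}")
```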
As described above, the primary outcome of this study was hip fracture using McNemar's test for binomial proportions for matched-pair data as the main analytic technique. For McNemar's test, we classified each hip as protected or unprotected and fractured or not.
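The unadjusted McNemar's test can be sketched in a few lines. Only discordant residents contribute: those with a fracture on the protected hip only (b) and those with a fracture on the unprotected hip only (c). The counts below are hypothetical, and the sketch omits the clustering adjustment used in the actual primary analysis.

```python
import math


def mcnemar(b: int, c: int):
    """McNemar's test for paired binary outcomes.

    b: pairs with the event on the protected side only.
    c: pairs with the event on the unprotected side only.
    Returns (chi-square statistic, asymptotic p-value, exact two-sided p-value).
    """
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    chi2 = (b - c) ** 2 / n
    # Survival function of a chi-square with 1 degree of freedom.
    p_asymptotic = math.erfc(math.sqrt(chi2 / 2.0))
    # Exact test: under the null, b ~ Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    p_exact = min(1.0, 2.0 * tail)
    return chi2, p_asymptotic, p_exact


# Hypothetical counts, not study results.
chi2, p_asym, p_exact = mcnemar(b=10, c=20)
print(f"chi2 = {chi2:.2f}, asymptotic p = {p_asym:.3f}, exact p = {p_exact:.3f}")
```

Residents with no fracture, or with fractures on both hips, are concordant pairs and do not enter the statistic.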
Since these data are not only matched-pair data, but also clustered within nursing home, adjustments for clustering were made to McNemar's test. We considered the statistical procedures developed by Obuchowski and Durkalski. Obuchowski's approach avoids having to assume a constant within-cluster correlation, while Durkalski's approach does assume a constant within-cluster correlation. Obuchowski's test statistic for clustered matched-pair data takes into account the possible variation in correlation between units within a cluster. Since the Obuchowski procedure is more flexible than the Durkalski procedure, we used it for the primary analysis.
Because of the clustered matched-pair design, we used mixed models to investigate the effect of resident-level factors on the outcome of the hip fracture as secondary analyses. Mixed models are able to adjust for the multiple layers of clustering – including the clinical center, the nursing home and the unit within the nursing home. Because of our clustered matched-pair design, there were no differences in characteristics between the protected and unprotected hips with the exception of treatment side (left vs. right). Any factors that have an effect on falling and fracture will be unrelated to a resident being a case or a control. For example, women may fall more than men and, because of osteoporosis, may be at higher risk of fracture, but this will be independent of treatment. However, factors at the facility and unit level are not necessarily balanced, such as staff-to-resident ratio, facility physical characteristics (e.g., rugs vs. tile floors), or facility ownership.
The secondary aim of HIP PRO was to investigate the effect of resident- and facility-level characteristics on adherence. To investigate the effect of various characteristics on adherence, we used the proportion of all support visits to a resident in which the resident was adherent; thus, if the nursing home was visited 12 times during a month, and a particular resident was found to be wearing the undergarment on 10 of those visits, the resident's adherence for the month was 10/12 or 0.83. Because adherence can change, we included a monthly assessment of adherence. Because a resident was assessed monthly, we used mixed models that adjusted for this additional level of clustering caused by intra-individual correlation.
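The monthly adherence measure can be computed as sketched below, reproducing the 10-of-12 example from the text. The record layout (resident identifier, month label, adherent flag) and the identifiers used are hypothetical and are not the study's actual data format.

```python
from collections import defaultdict


def monthly_adherence(observations):
    """Proportion of support visits per resident-month with the protector worn.

    observations: iterable of (resident_id, month, adherent) tuples, where
    adherent is True if the protector was worn and properly positioned.
    Returns a dict keyed by (resident_id, month).
    """
    worn = defaultdict(int)
    seen = defaultdict(int)
    for resident_id, month, adherent in observations:
        key = (resident_id, month)
        seen[key] += 1
        if adherent:
            worn[key] += 1
    return {key: worn[key] / seen[key] for key in seen}


# Hypothetical resident: 12 support visits in one month, adherent on 10.
visits = [("R001", "M01", True)] * 10 + [("R001", "M01", False)] * 2
print(round(monthly_adherence(visits)[("R001", "M01")], 2))  # 0.83
```

Each resident contributes one such proportion per month, which is the repeated measure that the mixed models then treat as clustered within the individual.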
Because it is doubtful that any hip fractures (the primary outcomes) were missed by the research staff, no strategies were put into place for imputation for missing data. Similarly, adherence (the secondary outcome) was measured several times per week for each resident, so that if the information was missing at a visit, it was assessed at other visits that week.
We learned several lessons that will influence the next study in hip fracture prevention. First, recruitment was slower than anticipated (requiring a 6-month extension) with resistance because only one hip was protected for each resident. However, the practice of continual recruitment to replace dropouts worked well. The frequent adherence assessments also worked well and provided the necessary contact required to correct patient wearing habits and to establish trust between research staff and residents. Frequent contact was important and kept residents wearing pads routinely, contributing to the success of the intervention.
One of the limitations of this study and other studies of hip protectors is related to fall ascertainment. The vast majority of resident falls were unwitnessed and there was no way to document the number of falls or their direction. Even witnessed falls were often difficult to recreate in the observer's mind. An additional problem was whether backward falls fractured hips that were protected by a pad. Unobserved backward falls could affect power estimates and the estimated efficacy of the protector. Tracking the falls that did not result in fractures, a logical denominator for the number of fractures, was not possible without some form of constant surveillance, such as closed-circuit cameras or pressure sensors at strategic locations on the resident.
Generalizability of our findings may be limited in several ways. For a variety of reasons, adherence was higher in this study than in others (and probably in routine nursing home settings). The use of a run-in period eliminated residents who were likely to be nonadherent. A variety of staff incentives kept nursing home staff interested and engaged. Perhaps the most important limitation is that we conducted frequent adherence assessment visits, which led to higher adherence than seen in other hip pad studies. In addition, the use of a single pad on residents is not analogous to normal use in nursing homes and may have adversely affected acceptance and adherence. Wearing a single pad may also have caused unanticipated changes in behavior, which could have increased the frequency or influenced the direction of the falls. Finally, hip pad designs continue to evolve, as do fall prevention measures, so that any treatment effect seen with this pad may not be applicable to other pad designs or even to the same pad in the future. Nevertheless, information about factors related to adherence will be useful for nursing home studies of other pads.
HIP PRO's unique clustered matched-pair design has implications for other trials in which one individual's side is compared to the other (e.g., legs, eyes, ears). In this study, each person was his/her own 'case' (protected hip) and own 'control' (unprotected hip). With the primary difference being treatment side (left vs. right), there were no differences in the distribution of relevant risk factor characteristics between the cases and controls. While factors such as gender and bone strength might be related to fracture, or a nursing home fall prevention program might be related to the number of falls, these factors were rendered independent of treatment because each resident was both a case and a control. Further, because of the clustered matched-pair design, mixed models were used to investigate the effect of resident-level factors on the outcome of the hip fracture. Mixed models were also able to adjust for the multiple layers of clustering (i.e., clinical site, nursing home, and unit within the nursing home). Therefore, compared to earlier studies, HIP PRO results are more likely to represent a true treatment effect, rather than one confounded by experimental design.
The second unique study design feature was our frequency of adherence assessment, lacking in previous studies. A clinical trial that establishes efficacy of a treatment may be of little benefit if adherence was too low to support the treatment effect. By obtaining frequent adherence assessments coincident with the trial of efficacy, information is already available to support the validity of outcome results.
The authors wish to thank the members of the Clinical Endpoint Committee for their efforts in the blinded adjudication of hip fractures: Colleen Christmas, M.D. (Johns Hopkins School of Medicine, Baltimore, MD), Perry Colvin, M.D. (VA Medical Center, Baltimore, MD), Lawrence Holder, M.D. (retired), Roger Michael, M.D. (retired), and Robert Sterling, M.D. (University of Maryland School of Medicine, Baltimore, MD).
The authors also wish to thank the members of the Data and Safety Monitoring Committee for their efforts and oversight of the study: Dorothy Baker, Ph.D. (Yale School of Medicine, New Haven, CT), Mark A. Espeland, Ph.D. (Wake Forest University School of Medicine, Winston-Salem, NC), James O. Judge, M.D. (Masonicare, Wallingford, CT), Michael C. Nevitt, Ph.D., M.P.H., Chair (University of California, San Francisco, San Francisco, CA), and Laurence Rubenstein, M.D. (VA Medical Center, Sepulveda, CA).
This study was funded by a grant from the National Institutes of Health (R01 AG018461) and supported in part by the Lawrence J. and Anne Cable Rubenstein Charitable Foundation.