Clinicians use various criteria to diagnose developmental dysplasia of the hip (DDH) in early infancy, but the importance of these various criteria for a definite diagnosis is controversial. The lack of uniform, widely agreed-on diagnostic criteria for DDH in patients in this age group may result in a delay in diagnosis of some patients.
Our purpose was to establish a consensus among pediatric orthopaedic surgeons worldwide regarding the most relevant criteria for diagnosis of DDH in infants younger than 9 weeks.
We identified 212 potential criteria relevant for diagnosing DDH in infants by surveying 467 professionals. We used the Delphi technique to reach a consensus regarding the most important criteria. We then sent the survey to 261 orthopaedic surgeons from 34 countries.
The response rate was 75%. Thirty-seven items were identified by surgeons as most relevant to diagnose DDH in patients in this age group. Of these, 10 of 37 (27%) related to patient characteristics and history, 13 of 37 (35%) to clinical examination, 11 of 37 (30%) to ultrasound, and three of 37 (8%) to radiography. A Cronbach alpha of 0.9 for both iterations suggested consensus among the panelists.
We established a consensus regarding the most relevant criteria for the diagnosis of DDH in early infancy and established their relative importance on an international basis. The highest ranked clinical criteria included the Ortolani/Barlow test, asymmetry in abduction of 20° or greater, breech presentation, leg-length discrepancy, and first-degree relative treated for DDH.
Level IV, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
DDH is one of the most common congenital musculoskeletal conditions. Its most severe form, hip dislocation, occurs in one to two per 1000 children of predominantly northwest European ancestry living in the UK, Scandinavia, North America, and Australia. Milder forms occur more frequently, with prevalence estimates ranging from 40 to 60 per 1000 [8, 11, 16, 18] if ultrasound definitions are used. For neonatal DDH detected by clinical examination, prevalence estimates between two and 28 per 1000 have been reported.
Clinicians use various criteria to diagnose DDH in early infancy. These relate to the clinical examination, patient characteristics and history, and imaging tests. However, there is no consensus on using these criteria for a definitive diagnosis, particularly during the first 8 weeks of life [14, 15, 21], as it has been suggested that abnormal clinical and sonographic findings can resolve spontaneously in this age group [5, 17]. Consequently, some advocate treatment but others might recommend surveillance or no followup. Further, the role of diagnostic imaging has not been fully clarified for DDH. Roposch and Wright suggested that without an ultrasound test, an accurate diagnosis cannot be established. Others base their diagnosis primarily on clinical tests. Uncertainty in combination with professional enthusiasm and lack of evidence could largely explain these different approaches to the diagnosis of DDH. In addition, clinicians bring diverse experiences to the task of diagnosis, which have been obtained within an intellectual framework or paradigm specific to one’s professional education or personal experience.
The lack of uniform, widely agreed-on diagnostic criteria for DDH in this age group creates the potential for misdiagnosis, as there is no gold standard test. Inconsistencies in diagnosis could explain not only the variations in prevalence estimates of DDH but also why the diagnosis is made very early in some patients but late in others. Because late-diagnosed DDH in general is more likely to be associated with prolonged treatment and the potential for less favorable outcomes, correct and timely diagnosis is the key to successful management of DDH. Variations in diagnosis also have implications for determining the ‘success’ of treatment (by whatever measure) because of differing criteria for inclusion into a study. Thus, a given treatment might be ‘successful’ (say, in achieving stable reduction) at a high rate in one study yet at a lower rate in another study because the patients are not the same.
Owing to the lack of a gold standard diagnostic criterion for DDH during early infancy, a crucial step toward improved diagnosis of DDH in this age group is the use of widely accepted diagnostic criteria. These could be used to establish the prevalence of DDH, to develop treatment guidelines, and to enable multicenter research by reference to an agreed case definition. It is axiomatic that establishing consensus regarding the diagnostic criteria does not alone ensure validity. However, consensus should be seen as the starting point for establishing criteria that are valid and reliable. Once validated, standardized diagnostic criteria could be used by general practitioners and other healthcare providers to establish the probability of DDH in a manner approaching the practice of clinical experts. The overall goal of our research is to develop a novel diagnostic index for DDH in early infancy. To determine the criteria for such an index, we wanted to incorporate international opinion and consensus. Such an approach would improve the sensibility of the potential index and encourage greater acceptance among clinicians treating DDH.
The aims of our study were (1) to elicit and pool criteria useful for the diagnosis of DDH in early infancy; (2) to define the most relevant candidate items for further investigation; (3) to discern any consensus among international pediatric orthopaedic surgeons regarding the most relevant items; and (4) to derive a ranking order of these items.
We surveyed all members of the European Paediatric Orthopaedic Society (EPOS) between 2008 and 2010. EPOS is one of the largest societies for children’s orthopaedic specialists with 326 members from more than 30 countries (Fig. 1). We considered only surgeons who treat patients with DDH eligible for this study. We used the tailored design method for survey design and distribution. This method entails a series of principles to improve the response rates of surveys such as a respondent-friendly survey with easy-to-understand language; four contacts made by first-class mail and e-mail, with an additional special contact by telephone or fax; provision of return envelopes with postage stamps; and personalized correspondence.
This study was approved by the Institutional Review Board, the Scientific Committee of EPOS, and the Board of the British Society of Children’s Orthopaedic Surgery (BSCOS).
The first survey generated criteria (items) thought to be important in the diagnosis of DDH. This open-ended questionnaire asked participants to list the variables that, in their perceptions, were reflective of the presence of DDH in infants 8 weeks old or younger (Fig. 2). Suggested headings, used as prompts, included patient history, patient characteristics, physical examination, coexisting diagnoses, and imaging tests. For the purpose of this study, we defined DDH as a condition that warrants either treatment or followup with a pediatric orthopaedic surgeon. The draft survey was pretested on 15 pediatric orthopaedic surgeons for readability, content validity, and face validity. After incorporating their feedback the questionnaire was sent to 295 members of EPOS and 172 members of BSCOS.
Items generated by this survey were complemented by items from a literature search and key informant interviews. For the first survey (item generation), the response rate was 75% (351/467): 77% for EPOS (226/295) and 73% for BSCOS (125/172). Of those, 43 did not treat patients with DDH or had retired, leading to 289 of 424 (68%) completed questionnaires in total (195 EPOS; 94 BSCOS). We transcribed the returned surveys onto electronic spreadsheets. The cumulative frequencies of all 232 elicited items were recorded and like items were combined, which left 188 pooled independent items (Appendix 1).
These items were subject to an item-reduction process performed by a multinational panel (AR, FH, JHW, NMPC). We combined like items and eliminated redundant or implausible items. Each expert initially reviewed the 232 items independently and gave feedback regarding the domains created. Each expert then examined all items for redundancy, similarity, and plausibility for each domain. This task was accomplished in several steps, and in each step every expert had to agree with the decision made by the whole panel before the next step was entered. Items with a high frequency were not eliminated, even if the expert panel uniformly considered them irrelevant. This process left us with 37 candidate items (Table 2).
We then evaluated these 37 items using the Delphi survey technique, an anonymous iterative group consensus process that has been used successfully in diagnostic research [6, 20]. With the Delphi technique, the participants never meet face-to-face. Instead, ideas are expressed to participants in the context of a mailed questionnaire. The participants provide their feedback in the form of questionnaires; the process remains anonymous. Several surveys are conducted until consensus is reached.
The lack of constraint on location of the participants, a major advantage of the Delphi technique, was particularly useful for this study. We sent two different Delphi surveys to 261 and 220 members of EPOS, respectively, both of which included the candidate items that had been generated in the previous step. The first Delphi survey asked the respondents to rate each of the items for their relative importance on a 10-cm visual analog scale with the anchors ‘completely unimportant’ and ‘extremely important’. Although the diagnosis sometimes is made from two or more criteria that occur simultaneously, for the purpose of this survey respondents were asked to consider all the listed criteria regardless of other abnormalities. They also were asked to list any comments they might have on these specific items, and to list any additional items that might have been missed from the initial phases. Group means and standard deviations were calculated for each item. For the first Delphi survey (iteration 1), the response rate was 75% (197/261) with 71% (156/220) completed questionnaires of those eligible. The second Delphi survey was sent to the same EPOS members and contained the same items plus a mean value and standard deviation (reflecting the average opinion of the panelists) for each. Respondents were asked to evaluate each item in light of this new information. A 10-point Likert scale was used with the anchors ‘completely unimportant’ and ‘extremely important’. For the second Delphi survey (iteration 2), the response rate was 67% (148/220) with 64% (139/217) completed questionnaires of those eligible. Respondents were pediatric orthopaedic surgeons from 34 countries who saw six patients with DDH per week on average. Eighty-seven percent practiced in academic centers. Delphi was terminated after the second iteration as consensus had been reached.
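The feedback step between Delphi iterations, in which each item's group mean and standard deviation are returned to the panelists, can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses were run in SAS); the function name `summarize_ratings`, the item labels, and the ratings are hypothetical and serve only to show the arithmetic. The sample (n − 1) standard deviation is assumed.

```python
from statistics import mean, stdev

def summarize_ratings(ratings_by_item):
    """Return {item: (group mean, sample SD)}, rounded to one decimal,
    as the summary that would be fed back to panelists in the next round."""
    summary = {}
    for item, ratings in ratings_by_item.items():
        # stdev() needs at least two ratings; report 0.0 for a single rating
        sd = stdev(ratings) if len(ratings) > 1 else 0.0
        summary[item] = (round(mean(ratings), 1), round(sd, 1))
    return summary

# Hypothetical iteration-1 ratings on a 0-10 scale (made up, not study data)
iteration_1 = {
    "item_a": [9.0, 8.0, 10.0, 9.0],
    "item_b": [2.0, 5.0, 3.0, 6.0],
}
feedback = summarize_ratings(iteration_1)
```

In this made-up example, `item_a` shows a high mean with little spread, whereas `item_b` shows a lower mean with more disagreement. That contrast is the kind of information a panelist would reconsider in the second iteration.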
The concept of consensus was defined as a condition of homogeneity of opinion among the panelists. Cronbach α has been used to quantify consistency among participants. Where the responses of the panelists are highly correlated, they are considered internally consistent or homogeneous. Coefficients close to one reflect consistency in the responses of the study participants and this consistency is likely to be observed in other samples selected in the same way from the same population. Although coefficients of 0.7 or greater are regarded as satisfactory, for clinical decisions coefficients of 0.9 or greater are adequate. We decided a priori to terminate the Delphi process when the 0.9 cut-off was reached, or when no change of Cronbach α for the items was observed between iterations. Items were ranked by their means and standard deviations and compared for iterations 1 and 2. Analyses were performed using SAS 9.2 (SAS Institute, Cary, NC, USA).
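For readers unfamiliar with the statistic, the following Python sketch computes Cronbach α with the standard formula α = k/(k − 1) × (1 − Σ column variances / variance of row totals) and applies a stopping rule of the kind described above. It is an illustration only: the data layout (rows as scored units, columns as the k sets of scores whose consistency is assessed), the function names, and the tolerance `tol` used to operationalize "no change" are assumptions, not details taken from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach alpha for a rectangular table of scores.

    scores: list of rows; each row holds k scores (assumed layout).
    Uses sample (n - 1) variances throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two rows and two columns")
    column_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(column_vars) / total_var)

def should_stop(alpha_now, alpha_prev=None, cutoff=0.9, tol=0.01):
    """Stop when alpha reaches the cutoff or no longer changes between rounds.

    tol is a hypothetical threshold for 'no change'; the source gives none.
    """
    if alpha_now >= cutoff:
        return True
    return alpha_prev is not None and abs(alpha_now - alpha_prev) < tol

# Made-up scores in which the two columns move together perfectly
example = [[1, 2], [2, 3], [3, 4]]
alpha = cronbach_alpha(example)
```

Under these assumptions, a coefficient that is identical in two consecutive iterations satisfies the second condition even if it lies just below the cutoff.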
Of 232 items elicited in the first survey, 183 independent items were pooled after checking for redundancy and combining like items. Of these, 14% concerned patient characteristics and history, 37% clinical examination and comorbidities, 35% ultrasound, and 14% related to radiography (Appendix 1). We then eliminated items if they were implausible (eg, “Graf type II b” or “presence of spina bifida”) or vaguely defined (eg, “shallow acetabular edge as seen on ultrasound” or “subluxation”). This process resulted in 37 candidate items in four domains: patient characteristics and history (10/37; 27%), clinical examination and coexisting disorders (13/37; 35%), ultrasound (11/37; 30%), and radiography (3/37; 8%) (Appendix 2).
Cronbach α was 0.88 at both iterations, indicating consensus of the panelists regarding the importance of the 37 criteria. Alpha for single domains ranged between 0.8 and 0.9, except for the domain ‘radiography’, in which a coefficient of 0.7 indicated poor consistency (Table 1).
For each domain the order of items seemed clinically reasonable, with the classic diagnostic criteria ranked highest and the most controversial ones (eg, hip click) ranked lowest (Table 2). Half of the 12 highest ranked items related to clinical examination, a third to ultrasound, and the remainder to aspects of patient history (Table 3). The ranking of items in the second iteration was qualitatively not different from that obtained in the first iteration, except that the order of the rankings was slightly different. Ranking only clinical features, the top five features are a positive Ortolani/Barlow test, asymmetry in abduction of 20° or greater, history of breech presentation or breech in utero, leg-length discrepancy/Galeazzi sign, and first-degree family history of DDH treatment (Table 4).
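A ranking of this kind can be sketched in a few lines of Python. The study ranked items by their means and standard deviations but does not state how ties were handled; the tie-break used here (smaller standard deviation first), the function name `rank_items`, and all ratings are hypothetical.

```python
from statistics import mean, stdev

def rank_items(ratings_by_item):
    """Order items by descending group mean; among equal means,
    the item with the smaller SD (closer agreement) is placed first.
    The tie-break is an assumption made for this illustration."""
    stats = [(item, mean(r), stdev(r)) for item, r in ratings_by_item.items()]
    stats.sort(key=lambda s: (-s[1], s[2]))
    return [item for item, _, _ in stats]

# Made-up ratings on a 0-10 scale, not study data
ratings = {
    "item_a": [9.0, 9.0, 10.0, 8.0],  # mean 9, some spread
    "item_b": [9.0, 9.0, 9.0, 9.0],   # mean 9, no spread
    "item_c": [3.0, 4.0, 2.0, 3.0],   # mean 3
}
ranking = rank_items(ratings)
```

Running the same function on the ratings from each iteration and comparing the two resulting lists is one simple way to see whether the order shifted between rounds.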
The decision whether to diagnose an infant with DDH is rooted in the orthopaedic surgeon’s judgment about the value and meaning of different criteria suggestive of this diagnosis. These judgments are pivotal as they will determine the further management plan (observe, treat, discharge) and ultimately affect clinical outcomes. The lack of uniform diagnostic criteria has the potential for incorrect or delayed diagnoses. Using a consensus technique, we aimed to identify the most relevant diagnostic criteria for DDH in early infancy by an international survey of expert pediatric orthopaedic surgeons.
Our study has potential limitations. First, as clinical knowledge evolves, the opinions of experts change. However, most of the criteria identified in this study have been in use for decades, and those regarded as most relevant are unlikely to change in the near future. The research is aimed at discerning any consensus (rather than the validity) of diagnostic criteria. The criteria need to be validated in additional research. Second, the Delphi technique has been criticized, as opinions can centralize around the most commonly expressed ideas. This can be addressed by a high degree of methodologic precision in data collation and analysis, which we adhered to. Particular attention was paid to the process of item reduction, which was performed by a panel of four international experts with different training and practice backgrounds. Third, inherent to the methodology of this research was that the importance of each criterion was established in isolation, when in clinical practice two or more criteria could occur simultaneously in a patient. However, this does not limit our study because its aim was first to define the nature and working definition of criteria important for the diagnosis of DDH, and second, to define the importance of each of these criteria. Examining combinations of criteria will be subject to future research.
The panelists reached consensus regarding the importance of the 37 criteria after two iterations. Cronbach α was adequate with 0.9, the highest ranked items were identical in both iterations, and the standard deviations notably decreased for each item in iteration 2. Response rates were high for all surveys suggesting that the data are based on a representative sample of international specialist surgeons. The high response rate suggests there is considerable interest in the specialty in a consensus of this type. The high Cronbach α suggests that the pattern of data seen in this study is stable and similar results would be obtained if a second panel selected from the same population of content experts was studied. While our observations are generalizable to pediatric orthopaedic surgeons internationally who treat patients with DDH, we did not include pediatricians, general practitioners, and other healthcare professionals and are unable to quantify if and to what degree their opinions differ from those of pediatric orthopaedic surgeons. However, as pediatric orthopaedic surgeons are considered to provide the final opinions in the diagnosis of DDH, we took a pragmatic approach and focused on clinical experts for this study of diagnostic criteria. Incorporating the opinions of clinicians other than surgeons probably is more likely to establish referral criteria as opposed to definite diagnostic criteria. We did not attempt to reach a standardized case definition of DDH defined by long-term consequences such as osteoarthritis. Such an approach, based on a prognostic definition of disease, would require longitudinal research in which different characteristics of DDH are examined in relation to outcomes of interest.
Specialist surgeons identified 37 potentially important criteria in the diagnosis of DDH during the first 8 weeks of an infant’s life. As expected, these criteria predominantly concerned clinical examination, ultrasound, and patient characteristics and history. Consensus regarding the relative importance of these criteria was established. Delphi was a key factor in avoiding a consensus that might be skewed by one or more persuasive panelists. The movement in the opinions between iterations 1 and 2 (Table 3) appeared to result from the feedback of information describing the group opinion. Half of the top-ranked items related to criteria obtained from clinical examination and a third related to criteria obtained from ultrasound, underlining the value of clinical and ultrasound tests. All listed items seem clinically reasonable and are among the most commonly cited diagnostic criteria for DDH. Classic clinical tests such as Ortolani, Barlow, Galeazzi, and asymmetry in abduction were among the highest-ranking criteria (Table 3). This confirms the wide agreement regarding the relevance of these criteria among surgeons internationally. Questionable but commonly cited criteria such as ‘hip click’ or ‘asymmetric groin creases’ were among the lowest ranking items, confirming that these criteria are of limited to no value in making a diagnosis. However, they might be of value for initiating referrals from primary care professionals to specialist surgeons.
Several criteria relating to patient characteristics and history, commonly referred to as risk factors for DDH, were examined. The highest-ranking criteria included breech presentation and a positive first-degree family history of DDH. This is in agreement with current clinical guidelines [1, 12]. Consistent with current guidelines [1, 12], other commonly cited factors such as female gender or oligohydramnios were rated less important by the panelists. However, the panelists rated each criterion in isolation when the importance of one criterion actually might increase if it occurs together with another factor. For example, the absolute risk for DDH is between 70 and 120 per 1000 when the factors ‘female gender’ and ‘breech presentation’ occur together in one patient. Understanding the meaning of such combinations will be part of our next research to validate the criteria. The majority of the ultrasound criteria generated in the first survey related to the commonly cited alpha angle and femoral head coverage, for which several thresholds were explored in the Delphi process. The panelist-specified thresholds for femoral head coverage of 45% and 50% are similar to the criteria outlined by Terjesen et al. As for the alpha angle, the panelists specified three important thresholds: 45°, 50°, and 55°. These are in agreement with Graf’s classification, which recommends immediate treatment for hips with an alpha angle less than 45° and surveillance or treatment for hips with an alpha angle less than 55° in infants 8 weeks old or younger. The threshold of 60° was given the lowest rating, suggesting that the panelists’ group opinion regarded an alpha angle of 55° as sufficient in this age group. Three radiographic criteria were listed in the item generation survey but their importance was rated as low in the consensus process with a low Cronbach α, confirming that ultrasound is the preferred imaging test in the age group of interest, with no value for radiography.
We have established an international consensus of diagnostic criteria for DDH in early infancy by well-experienced surgeons. Clinicians can now distinguish and choose between criteria internationally regarded as the best for a definite diagnosis and those regarded as unimportant. This consensus will pave the way for additional research to produce a novel diagnostic index. Although these criteria remain to be validated, our findings are useful for clinicians in that we provide them with a reference of their peers’ opinions. Clinicians can determine how their personal preferences for diagnostic criteria differ from others’ preferences. Such a comparison will reassure clinicians either that their practice is mirrored by their peers, or, if not, it will provide a basis for reconsidering their practice. Although we did not establish criteria for DDH defined by long-term consequences such as osteoarthritis, our study provides the groundwork for such research in that it determined what these different characteristics could be.
We thank the members of EPOS and BSCOS for their valuable contributions to this research. We thank Drs Kuldeep Stohr (London, England), Elke Viehweger (Marseille, France), and Antonio Andreacchio (Turin, Italy) for help with data collection. Drs James Hunter, Nottingham, UK; Christopher Bradish, Deborah Eastwood, Robert Hill, David Jones, Aresh Hashemi-Nejad, Fabian Norman-Taylor, Mark Paterson, London, UK; Andrew Wainwright, Oxford, UK; Andrew Howard, Lillian Sung, and James G. Wright, Toronto, Canada, participated in the field test.
One or more of the authors has received funding from Bupa Foundation UK (AR) and Great Ormond Street Hospital Charitable Foundation (AR).
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research editors and board members are on file with the publication and can be viewed on request.
Each author certifies that his or her institution approved the human protocol for this investigation, that all investigations were conducted in conformity with ethical principles of research, and that informed consent for participation in the study was obtained.
This work was done at the Institute of Child Health, University College London, London, UK.