Factors governing the appropriateness, reliability and validity of rating scales in the measurement of professional performance are reviewed. The origin of a brief consultation rating schedule, and its preliminary testing among undergraduates and general practitioners, are described.
Statistical criteria are proposed for the analysis of ratings, by groups, in the comparison of consultation performance. Using these criteria, the capacity of the 10 rating schedule items to discriminate between two contrasting consultations was examined. Each of the items was used at some time by students or doctors to express significant preference for the same consultation, and on this basis all the items are considered to merit inclusion. One item showed highly significant intra- and inter-observer reliability.
The schedule is reproduced in full, together with a data-collection document and significance chart, with the aim of encouraging groups of doctors to test the validity of the items in the comparison of other pairs of consultations. It is proposed that future versions of the schedule should reflect the experience of such groups in testing existing items and in defining additional items which satisfy the proposed criteria.