Objective: To assess the internal consistency and inter-rater reliability of a clinical evaluation exercise (CEX) format that was designed
to be easy to use yet sufficiently detailed to achieve uniform recording of the observed examination.
Design: A comparison of 128 CEXs conducted for 32 internal medicine interns by full-time faculty. This paper reports alpha coefficients
as measures of internal consistency and several measures of inter-rater reliability.
Setting: A university internal medicine program. Observations were conducted at the end of the internship year.
Participants: Participants were 32 interns; observers were 12 full-time faculty in the department of medicine. The entire intern group
was chosen in order to optimize the spectrum of abilities represented. Patients used for the study were recruited by the chief
resident from the inpatient medical service based on their ability and willingness to participate.
Intervention: Each intern was observed twice, with two examiners present during each CEX. The examiners were given a standardized preparation
and used a format developed over five years of previous pilot studies.
Measurements and main results: The format appeared to have excellent internal consistency; alpha coefficients ranged from 0.79 to 0.99. However, inter-rater
reliability was low, and multiple methods of determining it yielded similar results: intraclass correlations ranged from 0.23 to 0.50 and
generalizability coefficients from a low of 0.00 for the overall rating of the CEX to a high of 0.61 for the physical examination
section. Transforming scores to eliminate rater effects and dichotomizing results into pass-fail did not appear to enhance
the reliability results.
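For readers unfamiliar with the two kinds of statistic reported here, the following is a minimal sketch of how an alpha coefficient and an intraclass correlation can be computed. It is an illustration under stated assumptions, not the study's analysis: the function names are hypothetical, the one-way random-effects form of the intraclass correlation is assumed (the abstract does not say which form was used), and the toy scores are invented for demonstration only.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row per observed exercise,
    each row holding the k item scores on the rating form."""
    k = len(scores[0])
    # Sample variance of each item (column) across exercises.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed score for each exercise.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_oneway(ratings):
    """One-way random-effects intraclass correlation for single ratings
    (an assumed form). Each row is one exercise, each column one rater."""
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-exercise and within-exercise mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Invented toy data: 4 exercises, 3 form items, and 2 raters per exercise.
toy_items = [[3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
toy_ratings = [[3, 4], [5, 3], [2, 4], [4, 4]]
print(cronbach_alpha(toy_items), icc_oneway(toy_ratings))
```

High alpha alongside low intraclass correlation, as reported above, would mean that the items on a single rater's form move together while two raters watching the same exercise still disagree.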
Conclusions: Although the CEX is a valuable didactic tool, its psychometric properties preclude reliable assessment of clinical skills
as a one-time observation.
Key words: clinical evaluation exercise - inter-rater reliability - education - performance assessment
Received from the Division of General Internal Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania.
Presented in part at the annual meeting of the Society of General Internal Medicine, May 2, 1990, Arlington, Virginia, and
before the American Board of Internal Medicine, Committee on Research and Development, October 16, 1990, Philadelphia, Pennsylvania.
Supported by a grant from the American Board of Internal Medicine.