Abstract: Ordinal categorical assessments are common in medical practice and in research. Variability in such measurements among the raters making the assessments can be problematic. In this paper we consider how such variability can be described statistically. We review three current approaches (kappa-type statistics, loglinear models for agreement, and latent class agreement models) and discuss their limitations. We present a new graphical approach to describing interrater variability that involves a simple frequency distribution display of the category probabilities. The method enables description of interrater variability when raters are a random sample from some population, as opposed to the traditional setting in which only a few selected raters provide assessments. Advantages of this approach relative to current approaches include the following: (1) it provides a simple visual summary of the rating data, (2) the description is closely linked to familiar methods for describing variability in continuous measurements, (3) interpretation is straightforward, and (4) a large sample of raters can be accommodated with ease. We illustrate the method on simulated ordinal data representing radiologists' ratings of mammography images and on rating data from a national image reading study of mammography screening.