This paper reports an analysis of factors that reduce agreement in clinical evaluation. Four instructors representing different restorative departments met regularly over a four-month period to study the evaluation process. Each participant selected a task from his discipline, identified critical errors made in performing the task, produced these errors in models and extracted teeth, and developed criteria and checklists to aid in categorizing and counting the errors. In weekly sessions the models were evaluated, and judgments were compared. Factors that reduced agreement were identified, discussed, and listed. In some sessions ongoing evaluation was reported orally, tape-recorded, and analyzed to elucidate the underlying mental processes. Analysis indicated that agreement was reduced by 16 factors, including unclear rules, the faculty member's memory, unstandardized aids to judgment, inconsistent observational methods, differences in ability, and differing tendencies to leniency.