
Dataset

A meta-analysis on the reliability of comparative judgement

The data and R analysis script accompanying the article "A Meta-Analysis on the Reliability of Comparative Judgement". Comparative Judgement (CJ) aims to improve the quality of performance-based assessments by letting multiple assessors judge pairs of performances. CJ is generally associated with high levels of reliability, but reliability also varies considerably between assessments. This study investigates which assessment characteristics influence the level of reliability. A meta-analysis was performed on the results of 49 CJ assessments. Results show an effect of the number of comparisons on the level of reliability. In addition, the probability of reaching an asymptote in the reliability, i.e., the point where large effort is needed to only slightly increase the reliability, was larger for experts and peers than for novices. For a reliability level of .70, between 10 and 14 comparisons per performance are needed. This rises to between 26 and 37 comparisons for a reliability of .90.
Publication year: 2019
Accessibility: open
Publisher: Zenodo
License: CC-BY-4.0
Format: csv, doc, txt
Keywords: Educational sciences