Building team agreement on large population surveys through inter-rater reliability among oral health survey examiners

Keywords: inter-rater reliability; calibration training; oral health survey


March 31, 2018


Background: Oral health surveys of very large populations involve many examiners, who must score the different levels of an oral disease consistently. Before such a survey is implemented, inter-rater reliability (IRR) must be measured to establish the level of agreement among examiners, or raters. Purpose: This study aimed to assess IRR using consensus and consistency estimates in large population oral health surveys. Methods: A total of 58 dentists participated as raters. The benchmark examiner presented clinical samples for scoring dental caries and the community periodontal index (CPI), and the raters were trained to carry out a calibration exercise on a dental phantom. The consensus estimate was measured by percent agreement and Cohen’s kappa statistic. The consistency estimate of IRR was measured by Cronbach’s alpha coefficient and the intraclass correlation. Results: Percent agreement was 65.50% for photographic slides of dental caries, 73.13% for photographic slides of CPI and 78.78% for calibration of dental caries using the phantom. There was a statistically significant difference between dental caries calibration using photographic slides and using the phantom (p<0.001), while the consistency of IRR between multiple raters was strong (Cronbach’s alpha >0.9). Conclusion: Percent agreement across multiple raters is acceptable for the diagnosis of dental caries. Consistency between multiple raters is reliable when diagnosing dental caries and CPI.
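As an illustration of the consensus and consistency estimates named in the Methods, the following is a minimal Python sketch of percent agreement, Cohen's kappa and Cronbach's alpha. It is not the study's analysis code: the function names and the two score lists are hypothetical, and the scores are invented for illustration rather than taken from the survey data. Cronbach's alpha is computed here by treating each rater as an "item", which is one common convention and is assumed rather than confirmed by the source.

```python
from collections import Counter
from statistics import variance


def percent_agreement(a, b):
    """Share of items on which two raters give the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal proportions.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def cronbach_alpha(ratings):
    """Cronbach's alpha with each rater treated as an 'item'.

    ratings: list of per-rater score lists, all of the same length.
    """
    k = len(ratings)
    sum_of_rater_variances = sum(variance(r) for r in ratings)
    # Total score per examined subject, summed across raters.
    totals = [sum(scores) for scores in zip(*ratings)]
    return k / (k - 1) * (1 - sum_of_rater_variances / variance(totals))


# Hypothetical scores (0 = sound, 1 = early lesion, 2 = cavitated), NOT study data.
benchmark = [0, 1, 1, 2, 0, 1, 2, 2, 0, 1]
rater = [0, 1, 2, 2, 0, 1, 2, 1, 0, 1]

agreement = percent_agreement(benchmark, rater)
kappa = cohens_kappa(benchmark, rater)
alpha = cronbach_alpha([benchmark, rater])
print(f"agreement={agreement:.2f} kappa={kappa:.2f} alpha={alpha:.2f}")
```

Kappa is lower than raw percent agreement because it discounts the matches expected by chance alone, which is why the two consensus measures are usually reported together.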
