Introduction: Clinical wear measurement techniques can produce wide-ranging standard deviations (e.g., ±15-90 µm), which makes small changes in newer composites difficult to detect. While evaluator calibration (level=85%) is strongly encouraged for USPHS clinical ratings, inter- and intra-evaluator calibration for indirect wear assessment is not. Objective: To examine inter-evaluator agreement (consensus) in clinical wear assessment using the Leinfelder method in three long-term (5-10 y) posterior composite trials. Methods: Three previously reported studies (S1=FulFil, 1981-1991, Dentsply, n=65; S2=Occlusin, 1987-1992, ICI, n=80; S3=SureFil, 1998-2008, Dentsply, n=60) were included. Each monitored wear using the Leinfelder method (impressions; casts; 3 evaluators/trial; ratings equal to or between standards [0-46-92-152-221-272-322-352-382-438-493 µm]). Wear results ranged from 0±15 to 240±73 µm. Numerical ratings for each evaluator-restoration-recall-composite combination (N=3564) were transformed relative to the central rating (C=central, H=higher, L=lower; e.g., 156-92-92 µm = H-C-C = 2C) and then assessed for consensus (3C=all agree, 2C=two agree, 1C=none agree) for individual evaluators (e1, e2, e3). Average consensus (3C+2C+1C) for individual evaluators (E=[e1+e2+e3]/3) was statistically analyzed (1-way ANOVA, p≤0.05, Bonferroni correction). Restoration averages (R) of evaluator consensus (3C+2C) per restoration were compared to an 85% target. Results: Surprisingly, R-consensus values showed no statistically significant differences (C1, p=0.44; C2, p=0.42; C3, p=0.22), and most R-consensus values (15/21=71%) met or exceeded the 85% calibration target. H and L variations were almost exclusively limited to a single rating step (3C=0.75; 2C+1C with +1H and/or -1L=0.23).
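The C/H/L transformation and consensus classification described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the central value is the median of the three evaluator ratings, the function names (`transform`, `consensus`, `agreement_share`) are hypothetical, and apart from the abstract's own 156-92-92 µm example the rating triplets are made up for demonstration.

```python
from statistics import median


def transform(ratings):
    """Label each of three evaluator ratings (µm) as C, H, or L.

    Assumption: the central value is the median of the three ratings.
    """
    if len(ratings) != 3:
        raise ValueError("expected exactly three evaluator ratings")
    central = median(ratings)
    return tuple(
        "C" if r == central else "H" if r > central else "L" for r in ratings
    )


def consensus(ratings):
    """Return '3C', '2C', or '1C' by counting ratings equal to the central value."""
    return f"{transform(ratings).count('C')}C"


def agreement_share(triplets):
    """Fraction of rating triplets in which at least two evaluators agree (3C or 2C)."""
    levels = [consensus(t) for t in triplets]
    return sum(level in ("3C", "2C") for level in levels) / len(levels)


if __name__ == "__main__":
    # Example taken from the abstract: 156-92-92 µm = H-C-C = 2C
    print(transform((156, 92, 92)), consensus((156, 92, 92)))

    # Hypothetical triplets, for illustration only
    demo = [(92, 92, 92), (156, 92, 92), (46, 92, 152), (0, 0, 46)]
    print(agreement_share(demo) >= 0.85)  # compare against the 85% target
```

Counting only ratings that equal the central value keeps the three consensus levels mutually exclusive, so the 3C, 2C, and 1C shares for any set of triplets sum to 1.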
| Measure | Study | Baseline | 0.5y | 1y | 2y | 3y | 4y | 5y | 10y |
|---|---|---|---|---|---|---|---|---|---|
| E-Consensus | S1 | 0.79±0.00 | 0.65±0.05 | 0.68±0.15 | 0.68±0.15 | 0.66±0.09 | ----- | 0.61±0.05 | 0.71±0.11 |
| E-Consensus | S2 | 0.92±0.06 | 0.83±0.07 | 0.79±0.03 | 0.74±0.33 | 0.60±0.15 | 0.71±0.27 | 0.64±0.14 | ----- |
| E-Consensus | S3 | 0.94±0.02 | 0.87±0.11 | 0.87±0.14 | 0.75±0.03 | ----- | 0.62±0.30 | 0.76±0.10 | 0.75±0.11 |
| R-Consensus | S1 | 88% | 79% | 92% | 87% | 85% | ----- | 66% | 83% |
| R-Consensus | S2 | 100% | 100% | 96% | 95% | 70% | 91% | 77% | ----- |
| R-Consensus | S3 | 100% | 100% | 100% | 94% | ----- | 71% | 95% | 100% |

Conclusion: Strong consensus existed among experienced evaluators despite absence of regular calibration.

Acknowledgment: Dentsply-Caulk; ICI.