Source: Wais, M., Ooi, E., Leung, R. M., Vescan, A. D., Lee, J., & Witterick, I. J. (2012). The effect of low-fidelity endoscopic sinus surgery simulators on surgical skill. International Forum of Allergy and Rhinology, 2(1), 20–26. https://doi.org/10.1002/alr.20093.
Fourteen residents were randomized into two groups. The experimental group received a pretraining session on 5 different modules. The control group did not have any pretraining.
The pretraining consisted of 120 minutes of task training with feedback. There were 5 different tasks related to endoscopic surgery. These tasks covered aspects of the surgery including atraumatic scope navigation, depth perception, and familiarity with instruments.
The next day, the residents took part in a cadaveric endoscopic sinus surgery course. They were rated on a series of tasks using a Global Rating Scale (GRS) and a Task-Specific Checklist (TSC).
GRS was mainly process related: respect for tissue, time and motion, instrument handling, and flow of operation.
Residents who received pretraining performed better in the cadaveric tasks than those who did not. The assessments were made by blinded expert observers who rated video recordings of the tasks. Senior residents did not perform better than junior residents in the cadaveric tasks. The mean differences were 4.5 points for GRS and 7.2 points for TSC. (What is the maximum score?)
Both the senior and junior residents benefitted from the pretraining.
Inter- and intra-rater reliability were assessed using the ICC and the Pearson correlation coefficient (isn't this wrong? 2019-04-18: It seems wrong according to Rankin and Stokes. See also http://www.statstutor.ac.uk/resources/uploaded/coventryreliability.pdf ). From the table, intra-rater reliability appears to be better for TSC (0.86 to 0.997) than for GRS (0.4 to 0.999).
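A minimal sketch of why Pearson can be a poor reliability measure, as I understand the objection: it only measures linear association, so a rater who scores every resident a fixed number of points higher still correlates perfectly with the other rater. An absolute-agreement ICC penalizes that offset. The scores below are made up for illustration and are not from the paper; the helper names `pearson` and `icc_2_1` are mine, and ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, since I have not checked which ICC form the authors used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per subject, one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical scores: rater B is always exactly 4 points above rater A.
rater_a = [10, 12, 14, 16, 18, 20, 22]
rater_b = [score + 4 for score in rater_a]

r = pearson(rater_a, rater_b)
icc = icc_2_1([list(pair) for pair in zip(rater_a, rater_b)])
print(f"Pearson r = {r:.3f}, ICC(2,1) = {icc:.3f}")
```

With these made-up numbers Pearson comes out at 1.0 while ICC(2,1) is 0.7, so the systematic 4-point disagreement is invisible to Pearson but lowers the ICC.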
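They present the spread of their data using the standard error of the mean (SEM). That is not a good thing: the SEM describes how precisely the mean is estimated and shrinks as the sample grows, so it understates the variability between residents. The standard deviation (SD) would be the appropriate measure of spread.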
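A quick sketch of the SEM and SD relationship, SEM = SD / sqrt(n), which also lets a reader convert a reported SEM back to an SD when the group size is known. The scores are hypothetical, and n = 7 per group is my assumption (14 residents split into two groups); the function names are mine.

```python
from math import sqrt
from statistics import stdev

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def sd_from_sem(sem_value, n):
    """Recover the sample SD from a reported SEM and the group size."""
    return sem_value * sqrt(n)

# Hypothetical scores for one group of 7 residents (not from the paper).
scores = [20, 22, 25, 19, 24, 21, 23]

group_sd = stdev(scores)
group_sem = sem(scores)
print(f"SD = {group_sd:.2f}, SEM = {group_sem:.2f}")
print(f"SD recovered from SEM = {sd_from_sem(group_sem, len(scores)):.2f}")
```

With n = 7 the SEM is smaller than the SD by a factor of sqrt(7), about 2.65, so error bars drawn with SEM look much tighter than the actual spread of scores.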