Clinical Orthopaedics and Related Research ®

A Publication of The Association of Bone and Joint Surgeons ®

Global Rating Scales and Motion Analysis Are Valid Proficiency Metrics in Virtual and Benchtop Knee Arthroscopy Simulators

Justues Chang MD, Daniel C. Banaszek MD, Jason Gambrel MD, Davide Bardana MD



Work-hour restrictions and fatigue management strategies in surgical training programs continue to evolve in an effort to improve the learning environment and promote safer patient care. In response, training programs must reevaluate how various teaching modalities such as simulation can augment the development of surgical competence in trainees. For surgical simulators to be most useful, it is important to determine whether surgical proficiency can be reliably differentiated using them. To our knowledge, performance on both virtual and benchtop arthroscopy simulators has not been concurrently assessed in the same subjects.


(1) Do global rating scales and procedure time differentiate arthroscopic expertise in virtual and benchtop knee models? (2) Can commercially available built-in motion analysis metrics differentiate arthroscopic expertise? (3) How well are performance measures on virtual and benchtop simulators correlated? (4) Are these metrics sensitive enough to differentiate by year of training?


In this cross-sectional study, 19 subjects (four medical students, 12 residents, and three staff) were recruited and divided into 11 novice arthroscopists (student to Postgraduate Year [PGY] 3) and eight proficient arthroscopists (PGY 4 to staff). All subjects completed a diagnostic arthroscopy and loose-body retrieval in both virtual and benchtop knee models. Global rating scales (GRS), procedure times, and motion analysis metrics were used to evaluate performance.


The proficient group scored higher than the novice group on the virtual GRS (novice versus proficient: 14 ± 6 [95% confidence interval {CI}, 10–18] versus 36 ± 5 [95% CI, 32–40], p < 0.001) and the benchtop GRS (16 ± 8 [95% CI, 11–21] versus 36 ± 5 [95% CI, 31–40], p < 0.001). The proficient subjects completed nearly all tasks faster than novice subjects, including the virtual scope (novice versus proficient: 579 ± 169 [95% CI, 466–692] versus 358 ± 178 [95% CI, 210–507] seconds, p = 0.02) and benchtop knee scope + probe (480 ± 160 [95% CI, 373–588] versus 277 ± 64 [95% CI, 224–330] seconds, p = 0.002). The built-in motion analysis metrics also distinguished novices from proficient arthroscopists using the self-generated virtual loose-body retrieval task scores (4 ± 1 [95% CI, 3–5] versus 6 ± 1 [95% CI, 5–7], p = 0.001). GRS scores on the virtual and benchtop models were very strongly correlated (ρ = 0.93, p < 0.001). Year of training was strongly correlated with both virtual GRS (ρ = 0.8, p < 0.001) and benchtop GRS (ρ = 0.87, p < 0.001) scores.


To our knowledge, this is the first study to evaluate performance on both virtual and benchtop knee simulators in the same subjects. We have shown that subjective GRS scores, objective motion analysis metrics, and procedure time are valid measures for distinguishing arthroscopic skill on both virtual and benchtop modalities. Performance on the two modalities is well correlated. We believe that training on artificial models allows acquisition of skills in a safe environment. Future work should compare the modalities in terms of the efficiency of skill acquisition, retention, and transferability to the operating room.
