Is a Single-Item Operative Performance Rating Sufficient?

Reed G. Williams, Steven Verhulst, John D. Mellinger, Gary Dunnington

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Objective: A valid measure of resident operative performance ability requires direct observation and accurate rating of multiple resident performances under the normal range of operating conditions. The challenge is to create an operative performance rating (OPR) system that is easy to use, encourages completion of many ratings immediately after performances, and minimally disrupts supervising surgeons' work days. The purpose of this study was to determine whether a score based on a single-item overall OPR provides a valid and stable appraisal of resident operative performances.

Design: A retrospective comparison of a single-item OPR with a gold-standard rating based on multiple procedure-specific and general OPR items.

Setting: Data were collected in the general surgery residency program at Southern Illinois University from 2001 through 2012.

Participants: Assessments of 1033 operative performances (3 common procedures: 2 laparoscopic and 1 open) by general surgery residents were collected. OPRs based on single-item overall performance scale scores were compared with gold-standard ratings for the same performances.

Results: Differences in performance scores using the 2 scales averaged 0.02 points (5-point scale). Correlations of the single-item and gold-standard scale scores averaged 0.95. Based on generalizability analyses of laparoscopic cholecystectomy ratings, each instrument required 5 observations to achieve reliabilities of 0.80 and 11 observations to achieve reliabilities of 0.90. Only 4.4% of single-item ratings misclassified the performance when compared with the gold-standard rating, and all misclassifications were near misses. For 80% of misclassified ratings, single-item ratings were lower.

Conclusions: Single-item operative performance measures produced ratings that were virtually identical to gold-standard scale ratings. Misclassifications occurred infrequently and were minor in magnitude. Ratings using the single-item scale take less time to complete, should increase the sample of procedures rated, and encourage attending surgeons to complete ratings immediately after observing performances. Face-to-face and written comments and suggestions should continue to be used to provide the granular feedback residents need to improve subsequent performances.
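The reported observation counts (5 for a reliability of 0.80, 11 for 0.90) can be sanity-checked with a short calculation. The sketch below is illustrative only: it assumes the Spearman-Brown relation, which is the single-facet special case of a generalizability (decision-study) projection, and is not the authors' actual analysis. The function names are hypothetical.

```python
def single_observation_reliability(target: float, n_obs: int) -> float:
    """Invert Spearman-Brown: the reliability of ONE rating implied by
    reaching `target` reliability when averaging `n_obs` ratings."""
    return target / (n_obs - (n_obs - 1) * target)


def projected_reliability(r1: float, n_obs: int) -> float:
    """Spearman-Brown: reliability of the mean of `n_obs` ratings,
    given single-rating reliability `r1`."""
    return n_obs * r1 / (1 + (n_obs - 1) * r1)


def observations_needed(r1: float, target: float) -> float:
    """Number of ratings (possibly fractional) needed to reach `target`."""
    return target * (1 - r1) / (r1 * (1 - target))


# Start from the abstract's first figure: 5 observations give 0.80.
r1 = single_observation_reliability(0.80, 5)
print(round(r1, 3))                             # 0.444
print(round(projected_reliability(r1, 11), 3))  # 0.898
print(round(observations_needed(r1, 0.90), 2))  # 11.25
```

Under this simplified model, 11 observations project to a reliability of about 0.898, which rounds to the 0.90 reported in the abstract. The small gap between 11 and 11.25 is expected, since the study's generalizability analysis may have modeled additional sources of variance.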

Original language: English (US)
Pages (from-to): e212-e217
Journal: Journal of Surgical Education
Volume: 72
Issue number: 6
DOI: 10.1016/j.jsurg.2015.05.002
State: Published - 2015

Fingerprint

Gold
rating
performance
gold standard
Laparoscopic Cholecystectomy
resident
Internship and Residency
Reference Values
Observation
surgery
rating scale
Surgeons

Keywords

  • general surgery
  • operative performance evaluation
  • resident training
  • surgical education

ASJC Scopus subject areas

  • Surgery
  • Education

Cite this

Is a Single-Item Operative Performance Rating Sufficient? / Williams, Reed G.; Verhulst, Steven; Mellinger, John D.; Dunnington, Gary.

In: Journal of Surgical Education, Vol. 72, No. 6, 2015, p. e212-e217.

Williams, Reed G.; Verhulst, Steven; Mellinger, John D.; Dunnington, Gary. / Is a Single-Item Operative Performance Rating Sufficient? In: Journal of Surgical Education. 2015; Vol. 72, No. 6. pp. e212-e217.
@article{78f618c010384264b4ea8b66097b93be,
title = "Is a Single-Item Operative Performance Rating Sufficient?",
abstract = "Objective A valid measure of resident operative performance ability requires direct observation and accurate rating of multiple resident performances under the normal range of operating conditions. The challenge is to create an operative performance rating (OPR) system that: is easy to use, encourages completion of many ratings immediately after performances and minimally disrupts supervising surgeons' work days. The purpose of this study was to determine whether a score based on a single-item overall OPR provides a valid and stable appraisal of resident operative performances. Design A retrospective comparison of a single-item OPR with a gold-standard rating based on multiple procedure-specific and general OPR items. Setting Data were collected in the general surgery residency program at Southern Illinois University from 2001 through 2012. Participants Assessments of 1033 operative performances (3 common procedures, 2 laparoscopic, and 1 open) by general surgery residents were collected. OPRs based on single-item overall performance scale scores were compared with gold-standard ratings for the same performances. Results Differences in performance scores using the 2 scales averaged 0.02 points (5-point scale). Correlations of the single-item and gold-standard scale scores averaged 0.95. Based on generalizability analyses of laparoscopic cholecystectomy ratings, each instrument required 5 observations to achieve reliabilities of 0.80 and 11 observations to achieve reliabilities of 0.90. Only 4.4{\%} of single-item ratings misclassified the performance when compared with the gold-standard rating and all misclassifications were near misses. For 80{\%} of misclassified ratings, single-item ratings were lower. Conclusions Single-item operative performance measures produced ratings that were virtually identical to gold-standard scale ratings. Misclassifications occurred infrequently and were minor in magnitude. Ratings using the single-item scale: take less time to complete, should increase the sample of procedures rated, and encourage attending surgeons to complete ratings immediately after observing performances. Face-to-face and written comments and suggestions should continue to be used to provide the granular feedback residents need to improve subsequent performances.",
keywords = "general surgery, operative performance evaluation, resident training, surgical education",
author = "Williams, {Reed G.} and Verhulst, Steven and Mellinger, {John D.} and Dunnington, Gary",
year = "2015",
doi = "10.1016/j.jsurg.2015.05.002",
language = "English (US)",
volume = "72",
pages = "e212--e217",
journal = "Journal of Surgical Education",
issn = "1931-7204",
publisher = "Elsevier Inc.",
number = "6",

}

TY - JOUR

T1 - Is a Single-Item Operative Performance Rating Sufficient?

AU - Williams, Reed G.

AU - Verhulst, Steven

AU - Mellinger, John D.

AU - Dunnington, Gary

PY - 2015

Y1 - 2015

AB - Objective A valid measure of resident operative performance ability requires direct observation and accurate rating of multiple resident performances under the normal range of operating conditions. The challenge is to create an operative performance rating (OPR) system that: is easy to use, encourages completion of many ratings immediately after performances and minimally disrupts supervising surgeons' work days. The purpose of this study was to determine whether a score based on a single-item overall OPR provides a valid and stable appraisal of resident operative performances. Design A retrospective comparison of a single-item OPR with a gold-standard rating based on multiple procedure-specific and general OPR items. Setting Data were collected in the general surgery residency program at Southern Illinois University from 2001 through 2012. Participants Assessments of 1033 operative performances (3 common procedures, 2 laparoscopic, and 1 open) by general surgery residents were collected. OPRs based on single-item overall performance scale scores were compared with gold-standard ratings for the same performances. Results Differences in performance scores using the 2 scales averaged 0.02 points (5-point scale). Correlations of the single-item and gold-standard scale scores averaged 0.95. Based on generalizability analyses of laparoscopic cholecystectomy ratings, each instrument required 5 observations to achieve reliabilities of 0.80 and 11 observations to achieve reliabilities of 0.90. Only 4.4% of single-item ratings misclassified the performance when compared with the gold-standard rating and all misclassifications were near misses. For 80% of misclassified ratings, single-item ratings were lower. Conclusions Single-item operative performance measures produced ratings that were virtually identical to gold-standard scale ratings. Misclassifications occurred infrequently and were minor in magnitude. Ratings using the single-item scale: take less time to complete, should increase the sample of procedures rated, and encourage attending surgeons to complete ratings immediately after observing performances. Face-to-face and written comments and suggestions should continue to be used to provide the granular feedback residents need to improve subsequent performances.

KW - general surgery

KW - operative performance evaluation

KW - resident training

KW - surgical education

UR - http://www.scopus.com/inward/record.url?scp=84961678105&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84961678105&partnerID=8YFLogxK

U2 - 10.1016/j.jsurg.2015.05.002

DO - 10.1016/j.jsurg.2015.05.002

M3 - Article

VL - 72

SP - e212

EP - e217

JO - Journal of Surgical Education

JF - Journal of Surgical Education

SN - 1931-7204

IS - 6

ER -