Examining rater and occasion influences in observational assessments obtained from within the clinical environment

Clarence D. Kreiter, Adam B. Wilson, Aloysius Humbert, Patricia A. Wade

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Background: When ratings of student performance within the clerkship consist of a variable number of ratings per clinical teacher (rater), an important measurement question arises regarding how to combine such ratings to accurately summarize performance. As previous G studies have not estimated the independent influence of occasion and rater facets in observational ratings within the clinic, this study was designed to provide estimates of these two sources of error. Method: During 2 years of an emergency medicine clerkship at a large midwestern university, 592 students were evaluated an average of 15.9 times. Ratings were performed at the end of clinical shifts, and students often received multiple ratings from the same rater. A completely nested G study model (occasion : rater : person) was used to analyze sampled rating data. Results: The variance component (VC) related to occasion was small relative to the VC associated with rater. The D study clearly demonstrates that having a preceptor rate a student on multiple occasions does not substantially enhance the reliability of a clerkship performance summary score. Conclusions: Although further research is needed, it is clear that case-specific factors do not explain the low correlation between ratings and that having one or two raters repeatedly rate a student on different occasions/cases is unlikely to yield a reliable mean score. This research suggests that it may be more efficient to have a preceptor rate a student just once. However, when multiple ratings from a single preceptor are available for a student, it is recommended that a mean of the preceptor's ratings be used to calculate the student's overall mean performance score.
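
To make the D study logic above concrete, the following Python sketch (not from the paper; all variance-component values and function names are hypothetical) shows why, under a fully nested occasion : rater : person design, averaging more occasions from the same rater scarcely improves the generalizability of a summary score when the rater variance component dominates, and how the recommended two-stage mean (average each preceptor's ratings, then average across preceptors) can be computed.

def g_coefficient(var_person, var_rater_in_person, var_occ_in_rater, n_raters, n_occasions):
    # Relative generalizability coefficient E(rho^2) for the mean of n_raters
    # raters, each observed on n_occasions occasions (design: occasion : rater : person).
    # Rater error averages over n_raters; occasion error over n_raters * n_occasions.
    error_var = (var_rater_in_person / n_raters
                 + var_occ_in_rater / (n_raters * n_occasions))
    return var_person / (var_person + error_var)

def summary_score(ratings_by_rater):
    # Two-stage mean recommended in the conclusions: average each preceptor's
    # ratings first, then average the per-preceptor means, so prolific raters
    # do not dominate the student's overall score.
    rater_means = [sum(scores) / len(scores) for scores in ratings_by_rater.values()]
    return sum(rater_means) / len(rater_means)

# Hypothetical variance components in which rater error dominates occasion error,
# mirroring the qualitative pattern reported in the Results (values are placeholders).
VAR_P, VAR_R_P, VAR_O_RP = 0.20, 0.50, 0.10

for n_o in (1, 2, 4, 8):   # more occasions with the same single rater: little gain
    print(f"1 rater, {n_o} occasion(s): {g_coefficient(VAR_P, VAR_R_P, VAR_O_RP, 1, n_o):.2f}")
for n_r in (1, 2, 4, 8):   # more independent raters, one occasion each: larger gain
    print(f"{n_r} rater(s), 1 occasion: {g_coefficient(VAR_P, VAR_R_P, VAR_O_RP, n_r, 1):.2f}")

print(summary_score({"preceptor_A": [7, 8, 8], "preceptor_B": [6]}))  # 6.83

With these placeholder values the coefficient rises only from about 0.25 to 0.28 as occasions for a single rater increase from one to eight, but reaches about 0.73 with eight independent raters rating once each, which is the pattern the D study describes.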

Original language: English (US)
Article number: 29279
Journal: Medical Education Online
Volume: 21
Issue number: 1
ISSN: 1087-2981
Publisher: Co-Action Publishing
DOI: 10.3402/meo.v21.29279
State: Published - 2016

Keywords

  • Clinical ratings
  • Clinical skills
  • Generalizability theory

ASJC Scopus subject areas

  • Medicine (all)
  • Education

Cite this

Examining rater and occasion influences in observational assessments obtained from within the clinical environment. / Kreiter, Clarence D.; Wilson, Adam B.; Humbert, Aloysius; Wade, Patricia A.

In: Medical Education Online, Vol. 21, No. 1, 29279, 2016.
