Approximate measurement invariance in cross-classified rater-mediated assessments

Type: Article

Published: 23 Dec 2014

Publisher: Frontiers Media SA

Journal: Frontiers in Psychology, volume 5 (eISSN: 1664-1078)

Authors: Ben Kelcey; Dan McGinn; Heather Hill

doi: 10.3389/fpsyg.2014.01469

pmid: 25566145

pmc: PMC4274900

Abstract

An important assumption underlying meaningful comparisons of scores in rater-mediated assessments is that measurement is commensurate across raters. When raters differentially apply the standards established by an instrument, scores from different raters are on fundamentally different scales and no longer preserve a common meaning and basis for comparison. In this study, we developed a method to accommodate measurement noninvariance across raters when measurements are cross-classified within two distinct hierarchical units. We conceptualized random item effects cross-classified graded response models and used random discrimination and threshold effects to test, calibrate, and account for measurement noninvariance among raters. By leveraging empirical estimates of rater-specific deviations in the discrimination and threshold parameters, the proposed method allows us to identify noninvariant items and empirically estimate and directly adjust for this noninvariance within a cross-classified framework. Within the context of teaching evaluations, the results of a case study suggested substantial noninvariance across raters and that establishing an approximately invariant scale through random item effects improves model fit and predictive validity.
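The role of rater-specific discrimination and threshold deviations can be illustrated with a minimal sketch of graded response model category probabilities. The parameterization below (a logistic link, with additive rater deviations `disc_shift` and `thresh_shifts` applied to an item's discrimination and thresholds) and every function and argument name are assumptions made for illustration. They are not taken from the article and may differ from the authors' specification.

```python
import math


def grm_category_probs(theta, discrimination, thresholds,
                       disc_shift=0.0, thresh_shifts=None):
    """Category probabilities for one item scored by one rater.

    Hypothetical parameterization, assumed for illustration:
      P(Y >= k) = logistic(a_r * (theta - b_rk)),
      a_r  = discrimination + disc_shift       (rater-specific discrimination)
      b_rk = thresholds[k] + thresh_shifts[k]  (rater-specific thresholds)
    Thresholds must be increasing and a_r positive for valid probabilities.
    """
    if thresh_shifts is None:
        thresh_shifts = [0.0] * len(thresholds)
    a = discrimination + disc_shift
    b = [t + s for t, s in zip(thresholds, thresh_shifts)]
    # Cumulative probabilities, padded with P(Y >= lowest) = 1 and 0 past the top.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b]
    cum += [0.0]
    # Adjacent differences give the probability of each score category.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Same lesson quality (theta), same item, two hypothetical raters.
baseline = grm_category_probs(0.5, 1.2, [-1.0, 0.0, 1.0])
lenient = grm_category_probs(0.5, 1.2, [-1.0, 0.0, 1.0],
                             thresh_shifts=[-0.5, -0.5, -0.5])
print(baseline)
print(lenient)
```

In this sketch, a rater with negative threshold deviations assigns the top category more often at the same underlying `theta`, which is the kind of scale mismatch the random item effects are meant to capture and adjust for.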

Keywords

Teaching, random item effects, measurement invariance, measurement equivalence, multilevel item response models, Psychology, BF1-990

Impact by BIP!

- Selected citations: 15
- Popularity: Top 10%
- Influence: Average
- Impulse: Top 10%