Inconsistencies across three contextual meanings of reliability

O. C S Tzeng, Janet Welch

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

The purpose of this article is to assess the nature of reliability and its inconsistent definitions across three contextual (conceptual, measurement and statistical) levels under the traditional true score theory. Due to such inconsistencies, two existing quantitative approaches (using r and covariance) are not uniformly understood in Psychology and other disciplines; consequently, their applications to measurements and testings are limited to ambiguous interpretations at the conceptual and measurement levels. To examine the extent of this problem, a questionnaire including various contextual definitions and interpretations of reliability in the literature was distributed in a nationwide survey. Results from six groups of experts representing editors, professors and advanced graduate students in both quantitative and clinical areas indicate that all subject groups generally agreed that a reliable instrument possesses the characteristics of the repeatability of responses of all test-takers at the conceptual level, and the reproducibility of the instrument with little or no variations from the underlying true scores at the measurement level. However, between the editors and noneditors, the endorsements of the common definition at the measurement level show obvious discrepancies. Further, at the statistical level, significant differences were found not only between but also within subject-groups in their interpretations of product-moment correlations and Alpha coefficients for the assessment of reliability at the conceptual and measurement levels. The causes of such inconsistencies were discussed in terms of the inherent limitations of the two statistical approaches used and their insufficiencies for indexing the conceptual and measurement meanings of reliability. Finally, this paper called for developing new statistical indices that are coherent with conceptual and measurement definitions. Before such development, the capacities of existing reliability indices shall be redefined and their application qualifications shall be proportionally re-established for educational, research and clinical purposes.
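
The abstract does not spell out which limitations of r and Alpha the authors have in mind. As a minimal illustration that is not taken from the article, the sketch below shows one well-known property that may be relevant: both the product-moment correlation and Cronbach's Alpha are unchanged by a constant shift in scores, so either index can equal 1 even when observed scores are not reproduced exactly. The data and the helper names (pearson_r, cronbach_alpha) are hypothetical.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """Cronbach's Alpha; items is a list of k lists, one per item,
    each holding the scores of the same n respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical test-retest data: the retest is the test plus 5 points.
test = [10, 12, 15, 18, 20]
retest_shifted = [s + 5 for s in test]

r_shifted = pearson_r(test, retest_shifted)
mean_abs_diff = mean(abs(a - b) for a, b in zip(test, retest_shifted))

# Hypothetical 3-item scale: item 2 is item 1 plus a constant offset.
items = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 2, 3, 4, 5]]
alpha = cronbach_alpha(items)

print(f"r = {r_shifted:.3f}, mean |test - retest| = {mean_abs_diff:.1f}")
print(f"alpha = {alpha:.3f}")
```

Here r and Alpha both come out at 1 although every retest score differs from its test score by 5 points, which is one way a statistical index can diverge from the measurement-level idea of "little or no variations" in scores.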

Original language: English (US)
Pages (from-to): 117-133
Number of pages: 17
Journal: Quality and Quantity
Volume: 33
Issue number: 2
State: Published - May 1999

Fingerprint

Inconsistency
interpretation
editor
Product-moment correlation
Reliability Index
Meaning
Group
Repeatability
Qualification
Reproducibility
Ambiguous
indexing
educational research
Inconsistent
Questionnaire
Discrepancy
university teacher
psychology

Keywords

  • Classical true score theory
  • Discrepancies among experts
  • Inherent limitations of r and Alpha
  • Need for new indices
  • Re-establishment of application qualifications
  • Reliability

ASJC Scopus subject areas

  • Statistics and Probability
  • Social Sciences (all)

Cite this

Tzeng, O. C. S., & Welch, J. (1999). Inconsistencies across three contextual meanings of reliability. Quality and Quantity, 33(2), 117-133.

Inconsistencies across three contextual meanings of reliability. / Tzeng, O. C S; Welch, Janet.

In: Quality and Quantity, Vol. 33, No. 2, 05.1999, p. 117-133.

Tzeng, OCS & Welch, J 1999, 'Inconsistencies across three contextual meanings of reliability', Quality and Quantity, vol. 33, no. 2, pp. 117-133.
Tzeng, O. C S ; Welch, Janet. / Inconsistencies across three contextual meanings of reliability. In: Quality and Quantity. 1999 ; Vol. 33, No. 2. pp. 117-133.
@article{18897b08e856447a8de848df2a18e0b9,
title = "Inconsistencies across three contextual meanings of reliability",
abstract = "The purpose of this article is to assess the nature of reliability and its inconsistent definitions across three contextual (conceptual, measurement and statistical) levels under the traditional true score theory. Due to such inconsistencies, two existing quantitative approaches (using r and covariance) are not uniformly understood in Psychology and other disciplines; consequently, their applications to measurements and testings are limited to ambiguous interpretations at the conceptual and measurement levels. To examine the extent of this problem, a questionnaire including various contextual definitions and interpretations of reliability in the literature was distributed in a nationwide survey. Results from six groups of experts representing editors, professors and advanced graduate students in both quantitative and clinical areas indicate that all subject groups generally agreed that a reliable instrument possesses the characteristics of the repeatability of responses of all test-takers at the conceptual level, and the reproducibility of the instrument with little or no variations from the underlying true scores at the measurement level. However, between the editors and noneditors, the endorsements of the common definition at the measurement level show obvious discrepancies. Further, at the statistical level, significant differences were found not only between but also within subject-groups in their interpretations of product-moment correlations and Alpha coefficients for the assessment of reliability at the conceptual and measurement levels. The causes of such inconsistencies were discussed in terms of the inherent limitations of the two statistical approaches used and their insufficiencies for indexing the conceptual and measurement meanings of reliability. Finally, this paper called for developing new statistical indices that are coherent with conceptual and measurement definitions. Before such development, the capacities of existing reliability indices shall be redefined and their application qualifications shall be proportionally re-established for educational, research and clinical purposes.",
keywords = "Classical true score theory, Discrepancies among experts, Inherent limitations of r and Alpha, Need for new indices, Re-establishment of application qualifications, Reliability",
author = "Tzeng, {O. C S} and Janet Welch",
year = "1999",
month = "5",
language = "English (US)",
volume = "33",
pages = "117--133",
journal = "Quality and Quantity",
issn = "0033-5177",
publisher = "Springer Netherlands",
number = "2",

}

TY - JOUR

T1 - Inconsistencies across three contextual meanings of reliability

AU - Tzeng, O. C S

AU - Welch, Janet

PY - 1999/5

Y1 - 1999/5

AB - The purpose of this article is to assess the nature of reliability and its inconsistent definitions across three contextual (conceptual, measurement and statistical) levels under the traditional true score theory. Due to such inconsistencies, two existing quantitative approaches (using r and covariance) are not uniformly understood in Psychology and other disciplines; consequently, their applications to measurements and testings are limited to ambiguous interpretations at the conceptual and measurement levels. To examine the extent of this problem, a questionnaire including various contextual definitions and interpretations of reliability in the literature was distributed in a nationwide survey. Results from six groups of experts representing editors, professors and advanced graduate students in both quantitative and clinical areas indicate that all subject groups generally agreed that a reliable instrument possesses the characteristics of the repeatability of responses of all test-takers at the conceptual level, and the reproducibility of the instrument with little or no variations from the underlying true scores at the measurement level. However, between the editors and noneditors, the endorsements of the common definition at the measurement level show obvious discrepancies. Further, at the statistical level, significant differences were found not only between but also within subject-groups in their interpretations of product-moment correlations and Alpha coefficients for the assessment of reliability at the conceptual and measurement levels. The causes of such inconsistencies were discussed in terms of the inherent limitations of the two statistical approaches used and their insufficiencies for indexing the conceptual and measurement meanings of reliability. Finally, this paper called for developing new statistical indices that are coherent with conceptual and measurement definitions. Before such development, the capacities of existing reliability indices shall be redefined and their application qualifications shall be proportionally re-established for educational, research and clinical purposes.

KW - Classical true score theory

KW - Discrepancies among experts

KW - Inherent limitations of r and Alpha

KW - Need for new indices

KW - Re-establishment of application qualifications

KW - Reliability

UR - http://www.scopus.com/inward/record.url?scp=0033121461&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033121461&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0033121461

VL - 33

SP - 117

EP - 133

JO - Quality and Quantity

JF - Quality and Quantity

SN - 0033-5177

IS - 2

ER -