Inconsistencies across three contextual meanings of reliability

Oliver C.S. Tzeng, Janet Welch

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


The purpose of this article is to assess the nature of reliability and its inconsistent definitions across three contextual levels (conceptual, measurement and statistical) under traditional true score theory. Because of these inconsistencies, the two existing quantitative approaches (using r and covariance) are not uniformly understood in psychology and other disciplines; consequently, their application to measurement and testing yields only ambiguous interpretations at the conceptual and measurement levels. To examine the extent of this problem, a questionnaire covering various contextual definitions and interpretations of reliability found in the literature was distributed in a nationwide survey. Results from six groups of experts, representing editors, professors and advanced graduate students in both quantitative and clinical areas, indicate that all subject groups generally agreed that a reliable instrument is characterized by the repeatability of the responses of all test-takers at the conceptual level, and by the reproducibility of the instrument with little or no variation from the underlying true scores at the measurement level. However, editors and non-editors showed obvious discrepancies in their endorsement of the common definition at the measurement level. Further, at the statistical level, significant differences were found not only between but also within subject groups in their interpretations of product-moment correlations and Alpha coefficients for the assessment of reliability at the conceptual and measurement levels. The causes of these inconsistencies are discussed in terms of the inherent limitations of the two statistical approaches and their insufficiency for indexing the conceptual and measurement meanings of reliability. Finally, the paper calls for the development of new statistical indices that are coherent with the conceptual and measurement definitions. Until such indices are developed, the capacities of existing reliability indices should be redefined and the qualifications for their application re-established accordingly for educational, research and clinical purposes.
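For readers unfamiliar with the two statistical indices named in the abstract, the following is a minimal sketch of how a product-moment correlation (r) and a covariance-based Alpha coefficient are commonly computed. It is not taken from the article: the function names `pearson_r` and `cronbach_alpha` and all scores are hypothetical, and the Alpha shown is the standard Cronbach formula, assumed here to be the coefficient the authors mean. The example also illustrates one well-known property of r (insensitivity to a constant shift in scores), which may or may not be among the limitations the authors discuss.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Product-moment correlation between two score lists of equal length."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cross / (ssx ** 0.5 * ssy ** 0.5)


def cronbach_alpha(items):
    """Cronbach's Alpha; `items` holds one score list per item,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical test-retest data: every retest score is 5 points higher,
# so no individual response is repeated exactly.
test = [10, 12, 15, 20]
retest = [score + 5 for score in test]
r = pearson_r(test, retest)  # equals 1 up to floating-point rounding

# Hypothetical three-item scale answered by four respondents.
items = [[2, 4, 4, 5], [3, 4, 5, 5], [1, 3, 4, 4]]
alpha = cronbach_alpha(items)
```

In this made-up retest case r is at its maximum even though every score changed, which shows how a statistical index can diverge from a conceptual definition based on repeatability of responses.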

Original language: English (US)
Pages (from-to): 117-133
Number of pages: 17
Journal: Quality and Quantity
Issue number: 2
State: Published - May 1999


Keywords

  • Classical true score theory
  • Discrepancies among experts
  • Inherent limitations of r and Alpha
  • Need for new indices
  • Re-establishment of application qualifications
  • Reliability

ASJC Scopus subject areas

  • Statistics and Probability
  • Social Sciences (all)
