Assessment of diagnostic markers by goodness-of-fit tests

Christos Nakas, Constantin Yiannoutsos, Ronald J. Bosch, Chronis Moyssiadis

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Receiver operating characteristic (ROC) curves are statistical tools used to assess the precision of diagnostic markers or to compare new diagnostic markers with old ones. The most common index employed for these purposes is the area under the ROC curve (θ), and several statistical tests exist for the null hypotheses H0: θ = 0.5 or, in two-marker comparisons, H0: θ1 = θ2, against alternatives of interest. In this paper we show that goodness-of-fit tests of uniformity of the distribution of the false positive (true positive) rates can be used instead of tests based on the area index. A semi-parametric approach is based on a completely specified distribution of marker measurements for either the healthy (F) or diseased (G) subjects, and this is extended to the two-marker case. We then extend the approach to the one- and two-marker cases in which neither distribution is specified (the non-parametric case). In general, ROC-based tests are more powerful than goodness-of-fit tests for location differences between the distributions of healthy and diseased subjects. However, ROC-based tests are less powerful when location-scale differences exist (producing ROC curves that cross the diagonal) and are incapable of discriminating between healthy and diseased samples when θ = 0.5 but F ≠ G. In these cases, goodness-of-fit tests have a distinct advantage over ROC-based tests. In conclusion, ROC methodology should be used with recognition of its potential limitations and should be replaced by goodness-of-fit tests when appropriate. The latter are a viable alternative and can be used as a 'black box' or as an exploratory first step in the evaluation of novel diagnostic markers.
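The case θ = 0.5 but F ≠ G can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes the healthy distribution F is fully specified as a standard normal, draws diseased measurements from a normal with the same mean but three times the standard deviation, and uses a one-sample Kolmogorov-Smirnov test as one possible goodness-of-fit test of uniformity (the abstract does not name a specific test). The function names, sample size, and seed are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean


def placement_values(diseased, healthy_cdf):
    """False positive rate at each diseased measurement: 1 - F(y)."""
    return [1.0 - healthy_cdf(y) for y in diseased]


def ks_uniform(u):
    """One-sample Kolmogorov-Smirnov test of u against Uniform(0, 1).

    Returns the statistic D and an asymptotic p-value (Kolmogorov series
    with Stephens' small-sample correction).
    """
    n = len(u)
    s = sorted(u)
    # Largest gap between the empirical CDF and the Uniform(0, 1) CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(s))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Assumed setting: F = N(0, 1) is known; G = N(0, 3^2), so theta = 0.5 but F != G.
random.seed(0)
F = NormalDist(0.0, 1.0)
diseased = [random.gauss(0.0, 3.0) for _ in range(200)]

fpr = placement_values(diseased, F.cdf)

# With F known, theta = P(X < Y) = E[F(Y)] = 1 - E[1 - F(Y)].
auc_hat = 1.0 - fmean(fpr)
d_stat, p_value = ks_uniform(fpr)

print(f"estimated AUC: {auc_hat:.3f}")
print(f"KS statistic: {d_stat:.3f}, p-value: {p_value:.2e}")
```

Under H0: F = G the values 1 - F(y) are uniform on (0, 1). In this scale-difference setting they pile up near 0 and 1, so the uniformity test should reject even though the estimated area stays close to 0.5.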

Original language: English (US)
Pages (from-to): 2503-2513
Number of pages: 11
Journal: Statistics in Medicine
Volume: 22
Issue number: 15
DOIs: 10.1002/sim.1464
State: Published - Aug 15 2003

Keywords

  • Area under the curve
  • Goodness-of-fit
  • Receiver operating characteristics
  • ROC

ASJC Scopus subject areas

  • Epidemiology

Cite this

Assessment of diagnostic markers by goodness-of-fit tests. / Nakas, Christos; Yiannoutsos, Constantin; Bosch, Ronald J.; Moyssiadis, Chronis.

In: Statistics in Medicine, Vol. 22, No. 15, 15.08.2003, p. 2503-2513.

@article{e2d349ac952f4291abf6720c6a529d55,
title = "Assessment of diagnostic markers by goodness-of-fit tests",
keywords = "Area under the curve, Goodness-of-fit, Receiver operating characteristics, ROC",
author = "Christos Nakas and Constantin Yiannoutsos and Bosch, {Ronald J.} and Chronis Moyssiadis",
year = "2003",
month = "8",
day = "15",
doi = "10.1002/sim.1464",
language = "English (US)",
volume = "22",
pages = "2503--2513",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "15",

}

TY - JOUR
T1 - Assessment of diagnostic markers by goodness-of-fit tests
AU - Nakas, Christos
AU - Yiannoutsos, Constantin
AU - Bosch, Ronald J.
AU - Moyssiadis, Chronis
PY - 2003/8/15
Y1 - 2003/8/15
KW - Area under the curve
KW - Goodness-of-fit
KW - Receiver operating characteristics
KW - ROC
UR - http://www.scopus.com/inward/record.url?scp=0041633635&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0041633635&partnerID=8YFLogxK
U2 - 10.1002/sim.1464
DO - 10.1002/sim.1464
M3 - Article
C2 - 12872305
AN - SCOPUS:0041633635
VL - 22
SP - 2503
EP - 2513
JO - Statistics in Medicine
JF - Statistics in Medicine
SN - 0277-6715
IS - 15
ER -