A comparison of parametric versus permutation methods with applications to general and temporal microarray gene expression data

Ronghui Xu, Xiaochun Li

Research output: Contribution to journal (Article)

22 Citations (Scopus)

Abstract

Motivation: In analyses of microarray data with a design of different biological conditions, ranking genes by their differential 'importance' is often desired so that biologists can focus research on a small subset of genes that are most likely related to the experiment conditions. Permutation methods are often recommended and used, in place of their parametric counterparts, due to the small sample sizes of microarray experiments and possible non-normality of the data. The recommendations, however, are based on classical knowledge in the hypothesis test setting.

Results: We explore the relationship between hypothesis testing and gene ranking. We indicate that the permutation method does not provide a metric for the distance between two underlying distributions. In our simulation studies permutation methods tend to be equally or less accurate than parametric methods in ranking genes. This is partially due to the discreteness of the permutation distributions, as well as the non-metric property. In data analysis the variability in ranking genes can be assessed by bootstrap. It turns out that the variability is much lower for permutation than parametric methods, which agrees with the known robustness of permutation methods to individual outliers in the data.
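The discreteness point in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the simulated Gaussian data, the choice of three samples per group, the mean-difference permutation statistic, and the helper names `welch_t`, `exact_permutation_p` and `simulate` are all assumptions made for illustration. With three samples per group there are only C(6,3) = 20 relabellings, so every exact permutation p-value is a multiple of 1/20 and many genes tie, whereas the parametric t score is continuous.

```python
import itertools
import math
import random
import statistics


def welch_t(x, y):
    """Two-sample Welch t statistic (the parametric score for one gene)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


def exact_permutation_p(x, y):
    """Exact two-sided permutation p-value for the absolute mean difference."""
    pooled = list(x) + list(y)
    observed = abs(statistics.mean(x) - statistics.mean(y))
    hits = 0
    total = 0
    # Enumerate every way of relabelling len(x) of the pooled values as group A.
    for idx in itertools.combinations(range(len(pooled)), len(x)):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = abs(statistics.mean(group_a) - statistics.mean(group_b))
        # Small tolerance so the observed labelling always counts itself.
        if diff >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


def simulate(n_genes=200, n_per_group=3, seed=0):
    """Simulate genes (hypothetical setup: first 10% shifted by 1 in group x)."""
    rng = random.Random(seed)
    abs_t, perm_p = [], []
    for g in range(n_genes):
        shift = 1.0 if g < n_genes // 10 else 0.0
        x = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        abs_t.append(abs(welch_t(x, y)))
        perm_p.append(exact_permutation_p(x, y))
    return abs_t, perm_p


if __name__ == "__main__":
    abs_t, perm_p = simulate()
    print("distinct |t| scores:", len(set(abs_t)))
    print("distinct permutation p-values:", len(set(perm_p)))
```

Ranking by the permutation p-value therefore leaves large groups of genes tied, while ranking by |t| orders essentially all of them. The sketch shows only the discreteness effect; it does not reproduce the paper's simulation design or its bootstrap assessment of ranking variability.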

Original language: English (US)
Pages (from-to): 1284-1289
Number of pages: 6
Journal: Bioinformatics
Volume: 19
Issue number: 10
DOI: 10.1093/bioinformatics/btg155
State: Published - Jul 1 2003
Externally published: Yes

Fingerprint

Microarrays
Gene Expression Data
Microarray Data
Gene expression
Permutation
Genes
Gene Expression
Gene
Ranking
Non-normality
Hypothesis Test
Small Sample Size
Microarray Analysis
Hypothesis Testing
Microarray
Sample Size
Bootstrap
Outlier
Experiments
Experiment

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

A comparison of parametric versus permutation methods with applications to general and temporal microarray gene expression data. / Xu, Ronghui; Li, Xiaochun.

In: Bioinformatics, Vol. 19, No. 10, 01.07.2003, p. 1284-1289.

@article{487ed321b25044b9a8a2eb81802577f7,
title = "A comparison of parametric versus permutation methods with applications to general and temporal microarray gene expression data",
abstract = "Motivation: In analyses of microarray data with a design of different biological conditions, ranking genes by their differential 'importance' is often desired so that biologists can focus research on a small subset of genes that are most likely related to the experiment conditions. Permutation methods are often recommended and used, in place of their parametric counterparts, due to the small sample sizes of microarray experiments and possible non-normality of the data. The recommendations, however, are based on classical knowledge in the hypothesis test setting. Results: We explore the relationship between hypothesis testing and gene ranking. We indicate that the permutation method does not provide a metric for the distance between two underlying distributions. In our simulation studies permutation methods tend to be equally or less accurate than parametric methods in ranking genes. This is partially due to the discreteness of the permutation distributions, as well as the non-metric property. In data analysis the variability in ranking genes can be assessed by bootstrap. It turns out that the variability is much lower for permutation than parametric methods, which agrees with the known robustness of permutation methods to individual outliers in the data.",
author = "Ronghui Xu and Xiaochun Li",
year = "2003",
month = "7",
day = "1",
doi = "10.1093/bioinformatics/btg155",
language = "English (US)",
volume = "19",
pages = "1284--1289",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "10",
}

TY - JOUR

T1 - A comparison of parametric versus permutation methods with applications to general and temporal microarray gene expression data

AU - Xu, Ronghui

AU - Li, Xiaochun

PY - 2003/7/1

Y1 - 2003/7/1

N2 - Motivation: In analyses of microarray data with a design of different biological conditions, ranking genes by their differential 'importance' is often desired so that biologists can focus research on a small subset of genes that are most likely related to the experiment conditions. Permutation methods are often recommended and used, in place of their parametric counterparts, due to the small sample sizes of microarray experiments and possible non-normality of the data. The recommendations, however, are based on classical knowledge in the hypothesis test setting. Results: We explore the relationship between hypothesis testing and gene ranking. We indicate that the permutation method does not provide a metric for the distance between two underlying distributions. In our simulation studies permutation methods tend to be equally or less accurate than parametric methods in ranking genes. This is partially due to the discreteness of the permutation distributions, as well as the non-metric property. In data analysis the variability in ranking genes can be assessed by bootstrap. It turns out that the variability is much lower for permutation than parametric methods, which agrees with the known robustness of permutation methods to individual outliers in the data.

UR - http://www.scopus.com/inward/record.url?scp=0038494587&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038494587&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btg155

DO - 10.1093/bioinformatics/btg155

M3 - Article

C2 - 12835273

AN - SCOPUS:0038494587

VL - 19

SP - 1284

EP - 1289

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 10

ER -