K-means+ method for improving gene selection for classification of microarray data

Heng Huang, Kong Zhang, Fei Xiong, Fillia Makedon, Li Shen, Bruce Hettleman, Justin Pearlman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

Microarray gene expression techniques have recently made phenotype classification possible for many diseases. One problem in this analysis is that each sample is represented by a very large number of genes, many of which are insignificant or redundant for characterizing the disease. Previous work has shown that selecting informative genes from microarray data can improve classification accuracy. Clustering methods have been applied successfully to group similar genes and then select informative genes from each group, which avoids redundancy and helps extract biological information. A drawback of these approaches is that the number of clusters must be specified in advance, and trying every possible number of clusters is time-consuming. In this paper, a heuristic called K-means+ is used to address the problems of dependency on the number of clusters and degeneracy. Our experimental results show that the K-means+ method can automatically partition genes into a reasonable number of clusters, from which the informative genes are then selected.
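
The cluster-then-select idea described in the abstract can be illustrated with a minimal sketch. This is not the K-means+ heuristic itself, whose steps the abstract does not describe: the sketch uses plain K-means with a fixed k and simply drops empty clusters. The function names, the first-k initialization, the class-gap score, and the toy expression matrix are all assumptions made for illustration.

```python
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=50):
    """Plain Lloyd-style K-means (assumption: first k points seed the centroids).
    Empty clusters are dropped, so fewer than k clusters may remain."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        labels = assign(points, centroids)
        groups = [[p for p, l in zip(points, labels) if l == c]
                  for c in range(len(centroids))]
        # Recompute each centroid as the per-sample mean of its members.
        new_centroids = [[mean(col) for col in zip(*g)] for g in groups if g]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign(points, centroids)

def class_gap(row, classes):
    """Hypothetical informativeness score: gap between the two class means."""
    a = [v for v, c in zip(row, classes) if c == 0]
    b = [v for v, c in zip(row, classes) if c == 1]
    return abs(mean(a) - mean(b))

def select_genes(expr, classes, k):
    """Cluster genes (rows), then keep the top-scoring gene of each cluster."""
    labels = kmeans(expr, k)
    best = {}  # cluster label -> (score, gene index)
    for gene, label in enumerate(labels):
        score = class_gap(expr[gene], classes)
        if label not in best or score > best[label][0]:
            best[label] = (score, gene)
    return sorted(gene for _, gene in best.values())

# Toy data (invented): rows are genes, columns are samples.
EXPR = [
    [1.0, 1.2, 5.0, 5.1],  # gene 0: higher in class 1
    [5.0, 5.2, 1.0, 0.9],  # gene 1: lower in class 1
    [1.1, 1.0, 4.0, 4.2],  # gene 2: similar to gene 0, weaker gap
    [4.8, 5.0, 1.5, 1.4],  # gene 3: similar to gene 1, weaker gap
]
CLASSES = [0, 0, 1, 1]  # class label of each sample
SELECTED = select_genes(EXPR, CLASSES, 2)
```

Here genes 0 and 2 fall into one cluster and genes 1 and 3 into the other, so only one representative per cluster is kept and the redundant near-copies are discarded. The fixed k passed to `select_genes` is exactly the dependency the abstract says K-means+ is meant to remove.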

Original language: English (US)
Title of host publication: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
Pages: 110-111
Number of pages: 2
DOIs: https://doi.org/10.1109/CSBW.2005.82
State: Published - 2005
Externally published: Yes
Event: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts - Stanford, CA, United States
Duration: Aug 8 2005 to Aug 11 2005

Other

Other: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
Country: United States
City: Stanford, CA
Period: 8/8/05 to 8/11/05

Fingerprint

Microarrays
Genes
Gene expression
Redundancy
Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Huang, H., Zhang, K., Xiong, F., Makedon, F., Shen, L., Hettleman, B., & Pearlman, J. (2005). K-means+ method for improving gene selection for classification of microarray data. In 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts (pp. 110-111). [1540562] https://doi.org/10.1109/CSBW.2005.82

@inproceedings{28f534928626464aafa44ef55afc74f2,
title = "K-means+ method for improving gene selection for classification of microarray data",
author = "Heng Huang and Kong Zhang and Fei Xiong and Fillia Makedon and Li Shen and Bruce Hettleman and Justin Pearlman",
year = "2005",
doi = "10.1109/CSBW.2005.82",
language = "English (US)",
isbn = "0769524427",
pages = "110--111",
booktitle = "2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts",

}

TY - GEN

T1 - K-means+ method for improving gene selection for classification of microarray data

AU - Huang, Heng

AU - Zhang, Kong

AU - Xiong, Fei

AU - Makedon, Fillia

AU - Shen, Li

AU - Hettleman, Bruce

AU - Pearlman, Justin

PY - 2005

Y1 - 2005

UR - http://www.scopus.com/inward/record.url?scp=33746871365&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33746871365&partnerID=8YFLogxK

U2 - 10.1109/CSBW.2005.82

DO - 10.1109/CSBW.2005.82

M3 - Conference contribution

SN - 0769524427

SN - 9780769524429

SP - 110

EP - 111

BT - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts

ER -