MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network

Yu Zhang, Sha Cao, Jing Zhao, Burair Alsaihati, Qin Ma, Chi Zhang

Research output: Contribution to journal › Article

Abstract

Background: Gene co-expression and differential co-expression analyses have been increasingly used to study co-functional and co-regulatory biological mechanisms from large-scale transcriptomics data sets. Methods: In this study, we develop MRHCA, a nonparametric approach that identifies hub genes and modules in a large co-expression network with low computational and memory cost. Results: We applied the method to simulated transcriptomics data sets and demonstrated that MRHCA can accurately identify hub genes and estimate the size of co-expression modules. By applying MRHCA and differential co-expression analysis to E. coli and TCGA cancer data, we identified significant condition-specific activated genes in E. coli and distinct gene expression regulatory mechanisms between cancer types with high copy number variation and those with small somatic mutations. Conclusion: Our analysis demonstrates that, compared with existing methods, MRHCA can (i) deal with large association networks, (ii) rigorously assess statistical significance for hubs and module sizes, (iii) identify co-expression modules with low associations, (iv) detect small and significant modules, and (v) allow genes to be present in more than one module.
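
The abstract does not spell out the algorithm, but the keyword list names Mutual Rank, a measure commonly defined as the geometric mean of the two directional correlation ranks of a gene pair. The sketch below illustrates only that rank transformation, assuming Pearson correlation as the association measure and a genes-by-samples expression matrix. The function name mutual_rank and its input layout are hypothetical, and the code is not the authors' MRHCA implementation.

```python
import numpy as np


def mutual_rank(expr):
    """Return a genes x genes mutual rank matrix (illustrative sketch only).

    expr: 2-D array with one row per gene and one column per sample.
    Assumption: Pearson correlation is used as the association measure.
    """
    corr = np.corrcoef(expr)
    n = corr.shape[0]
    # Push self-correlation to the bottom of every gene's ranking.
    np.fill_diagonal(corr, -np.inf)
    # order[i] lists the partners of gene i from most to least correlated.
    order = np.argsort(-corr, axis=1)
    # rank[i, j] = position of gene j in gene i's list (1 = most correlated).
    rank = np.empty_like(order)
    rows = np.arange(n)[:, None]
    rank[rows, order] = np.arange(1, n + 1)
    # Mutual rank: geometric mean of the two directional ranks.
    mr = np.sqrt(rank * rank.T).astype(float)
    np.fill_diagonal(mr, 0.0)
    return mr
```

A low mutual rank means that each gene sits near the top of the other's correlation list. Because the measure depends only on ranks, it makes no assumption about the distribution of the correlation values. This version builds the full dense correlation matrix for simplicity, so it does not reflect the low memory cost that the abstract reports for MRHCA.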

Original language: English (US)
Pages (from-to): 40-55
Number of pages: 16
Journal: Quantitative Biology
Volume: 6
Issue number: 1
DOI: 10.1007/s40484-018-0131-z
State: Published - Mar 1 2018

Keywords

  • algorithm for large scale networks analysis
  • gene co-expression network
  • Mutual Rank
  • statistical significance of gene co-expression

ASJC Scopus subject areas

  • Modeling and Simulation
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Computer Science Applications
  • Applied Mathematics

Cite this

MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network. / Zhang, Yu; Cao, Sha; Zhao, Jing; Alsaihati, Burair; Ma, Qin; Zhang, Chi.

In: Quantitative Biology, Vol. 6, No. 1, 01.03.2018, p. 40-55.

Zhang, Yu; Cao, Sha; Zhao, Jing; Alsaihati, Burair; Ma, Qin; Zhang, Chi. / MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network. In: Quantitative Biology. 2018; Vol. 6, No. 1. pp. 40-55.
@article{055284573cd14328afe297f479695013,
title = "MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network",
abstract = "Background: Gene co-expression and differential co-expression analyses have been increasingly used to study co-functional and co-regulatory biological mechanisms from large-scale transcriptomics data sets. Methods: In this study, we develop MRHCA, a nonparametric approach that identifies hub genes and modules in a large co-expression network with low computational and memory cost. Results: We applied the method to simulated transcriptomics data sets and demonstrated that MRHCA can accurately identify hub genes and estimate the size of co-expression modules. By applying MRHCA and differential co-expression analysis to E. coli and TCGA cancer data, we identified significant condition-specific activated genes in E. coli and distinct gene expression regulatory mechanisms between cancer types with high copy number variation and those with small somatic mutations. Conclusion: Our analysis demonstrates that, compared with existing methods, MRHCA can (i) deal with large association networks, (ii) rigorously assess statistical significance for hubs and module sizes, (iii) identify co-expression modules with low associations, (iv) detect small and significant modules, and (v) allow genes to be present in more than one module.",
keywords = "algorithm for large scale networks analysis, gene co-expression network, Mutual Rank, statistical significance of gene co-expression",
author = "Yu Zhang and Sha Cao and Jing Zhao and Burair Alsaihati and Qin Ma and Chi Zhang",
year = "2018",
month = "3",
day = "1",
doi = "10.1007/s40484-018-0131-z",
language = "English (US)",
volume = "6",
pages = "40--55",
journal = "Quantitative Biology",
issn = "2095-4689",
publisher = "Higher Education Press",
number = "1",

}

TY - JOUR

T1 - MRHCA

T2 - a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network

AU - Zhang, Yu

AU - Cao, Sha

AU - Zhao, Jing

AU - Alsaihati, Burair

AU - Ma, Qin

AU - Zhang, Chi

PY - 2018/3/1

Y1 - 2018/3/1

AB - Background: Gene co-expression and differential co-expression analyses have been increasingly used to study co-functional and co-regulatory biological mechanisms from large-scale transcriptomics data sets. Methods: In this study, we develop MRHCA, a nonparametric approach that identifies hub genes and modules in a large co-expression network with low computational and memory cost. Results: We applied the method to simulated transcriptomics data sets and demonstrated that MRHCA can accurately identify hub genes and estimate the size of co-expression modules. By applying MRHCA and differential co-expression analysis to E. coli and TCGA cancer data, we identified significant condition-specific activated genes in E. coli and distinct gene expression regulatory mechanisms between cancer types with high copy number variation and those with small somatic mutations. Conclusion: Our analysis demonstrates that, compared with existing methods, MRHCA can (i) deal with large association networks, (ii) rigorously assess statistical significance for hubs and module sizes, (iii) identify co-expression modules with low associations, (iv) detect small and significant modules, and (v) allow genes to be present in more than one module.

KW - algorithm for large scale networks analysis

KW - gene co-expression network

KW - Mutual Rank

KW - statistical significance of gene co-expression

UR - http://www.scopus.com/inward/record.url?scp=85044123468&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044123468&partnerID=8YFLogxK

U2 - 10.1007/s40484-018-0131-z

DO - 10.1007/s40484-018-0131-z

M3 - Article

AN - SCOPUS:85044123468

VL - 6

SP - 40

EP - 55

JO - Quantitative Biology

JF - Quantitative Biology

SN - 2095-4689

IS - 1

ER -