LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data

Changlin Wan, Wennan Chang, Yu Zhang, Fenil Shah, Xiaoyu Lu, Yong Zang, Anru Zhang, Sha Cao, Melissa L. Fishel, Qin Ma, Chi Zhang

Research output: Contribution to journal › Article

Abstract

A key challenge in modeling single-cell RNA-seq data is to capture the diversity of gene expression states regulated by different transcriptional regulatory inputs across individual cells, a task further complicated by the large proportion of observed zero and low expression values. We developed a left truncated mixture Gaussian (LTMG) model from the kinetic relationships among transcriptional regulatory inputs, mRNA metabolism and abundance in single cells. LTMG infers the multimodality of expression across single cells while treating dropouts and low expressions as left truncated. We demonstrated that LTMG has significantly better goodness of fit on an extensive number of scRNA-seq data sets compared with three other state-of-the-art models. Our biological assumption about low non-zero expressions, the rationality of the multimodality setting, and the capability of LTMG to extract expression states specific to cell types or functions are validated on independent experimental data sets. A differential gene expression test and a co-regulation module identification method are further developed. We experimentally validated that our differential expression test has higher sensitivity and specificity than five other popular methods. The co-regulation analysis is capable of retrieving gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package implementing all of these analyses is available at https://github.com/zy26/LTMGSCA.
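
To illustrate the idea described in the abstract, the sketch below evaluates the log-likelihood of a Gaussian mixture in which values at or below a cutoff are treated as left-censored: such observations contribute only the mixture's probability mass below the cutoff, while larger values contribute the usual mixture density. This is a minimal Python sketch under stated assumptions, not the authors' implementation and not the API of the LTMGSCA R package. The function name `censored_mixture_loglik`, its parameters, the toy data, and the reading of "left truncated" as censoring at a fixed cutoff on log-scale expression are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def censored_mixture_loglik(values, cutoff, weights, means, sds):
    """Log-likelihood of a Gaussian mixture with left-censoring at `cutoff`.

    Hypothetical helper (an assumption, not part of any published package).
    Values <= cutoff are treated as censored: only the fact that they fall
    at or below the cutoff is used, not their recorded magnitude.
    """
    components = list(zip(weights, means, sds))
    total = 0.0
    for x in values:
        if x <= cutoff:
            # Censored observation: mixture probability mass below the cutoff.
            p = sum(w * normal_cdf(cutoff, m, s) for w, m, s in components)
        else:
            # Fully observed value: ordinary mixture density.
            p = sum(w * normal_pdf(x, m, s) for w, m, s in components)
        total += math.log(p)
    return total


# Toy log-expression values for one gene across cells (made-up numbers).
toy_values = [-3.0, -3.0, -3.0, 0.4, 0.7, 2.1, 2.3, 2.6]
toy_loglik = censored_mixture_loglik(
    toy_values,
    cutoff=-1.0,
    weights=[0.5, 0.5],
    means=[0.0, 2.5],
    sds=[1.0, 0.5],
)
print(toy_loglik)
```

In a sketch like this, the mixture weights, means and standard deviations would typically be chosen to maximize the log-likelihood (for example with an EM-style procedure), and the number of components could be compared across fits; how the published method does either is not stated in the abstract.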

Original language: English (US)
Pages (from-to): e111
Journal: Nucleic Acids Research
Volume: 47
Issue number: 18
DOI: 10.1093/nar/gkz655
State: Published - Oct 10 2019

Fingerprint

RNA
Small Cytoplasmic RNA
Gene Expression
Sensitivity and Specificity
Messenger RNA
Genes

ASJC Scopus subject areas

  • Genetics

Cite this

LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data. / Wan, Changlin; Chang, Wennan; Zhang, Yu; Shah, Fenil; Lu, Xiaoyu; Zang, Yong; Zhang, Anru; Cao, Sha; Fishel, Melissa L.; Ma, Qin; Zhang, Chi.

In: Nucleic Acids Research, Vol. 47, No. 18, 10.10.2019, p. e111.

@article{e34fc7560f0042bf94c3553a294d3f3b,
title = "LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data",
abstract = "A key challenge in modeling single-cell RNA-seq data is to capture the diversity of gene expression states regulated by different transcriptional regulatory inputs across individual cells, which is further complicated by largely observed zero and low expressions. We developed a left truncated mixture Gaussian (LTMG) model, from the kinetic relationships of the transcriptional regulatory inputs, mRNA metabolism and abundance in single cells. LTMG infers the expression multi-modalities across single cells, meanwhile, the dropouts and low expressions are treated as left truncated. We demonstrated that LTMG has significantly better goodness of fitting on an extensive number of scRNA-seq data, comparing to three other state-of-the-art models. Our biological assumption of the low non-zero expressions, rationality of the multimodality setting, and the capability of LTMG in extracting expression states specific to cell types or functions, are validated on independent experimental data sets. A differential gene expression test and a co-regulation module identification method are further developed. We experimentally validated that our differential expression test has higher sensitivity and specificity, compared with other five popular methods. The co-regulation analysis is capable of retrieving gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package with all the analysis power is available at https://github.com/zy26/LTMGSCA.",
author = "Wan, Changlin and Chang, Wennan and Zhang, Yu and Shah, Fenil and Lu, Xiaoyu and Zang, Yong and Zhang, Anru and Cao, Sha and Fishel, Melissa L. and Ma, Qin and Zhang, Chi",
year = "2019",
month = "10",
day = "10",
doi = "10.1093/nar/gkz655",
language = "English (US)",
volume = "47",
pages = "e111",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "18",

}

TY - JOUR

T1 - LTMG

T2 - a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data

AU - Wan, Changlin

AU - Chang, Wennan

AU - Zhang, Yu

AU - Shah, Fenil

AU - Lu, Xiaoyu

AU - Zang, Yong

AU - Zhang, Anru

AU - Cao, Sha

AU - Fishel, Melissa L.

AU - Ma, Qin

AU - Zhang, Chi

PY - 2019/10/10

Y1 - 2019/10/10

AB - A key challenge in modeling single-cell RNA-seq data is to capture the diversity of gene expression states regulated by different transcriptional regulatory inputs across individual cells, which is further complicated by largely observed zero and low expressions. We developed a left truncated mixture Gaussian (LTMG) model, from the kinetic relationships of the transcriptional regulatory inputs, mRNA metabolism and abundance in single cells. LTMG infers the expression multi-modalities across single cells, meanwhile, the dropouts and low expressions are treated as left truncated. We demonstrated that LTMG has significantly better goodness of fitting on an extensive number of scRNA-seq data, comparing to three other state-of-the-art models. Our biological assumption of the low non-zero expressions, rationality of the multimodality setting, and the capability of LTMG in extracting expression states specific to cell types or functions, are validated on independent experimental data sets. A differential gene expression test and a co-regulation module identification method are further developed. We experimentally validated that our differential expression test has higher sensitivity and specificity, compared with other five popular methods. The co-regulation analysis is capable of retrieving gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package with all the analysis power is available at https://github.com/zy26/LTMGSCA.

UR - http://www.scopus.com/inward/record.url?scp=85072718658&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072718658&partnerID=8YFLogxK

U2 - 10.1093/nar/gkz655

DO - 10.1093/nar/gkz655

M3 - Article

C2 - 31372654

AN - SCOPUS:85072718658

VL - 47

SP - e111

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 18

ER -