RegSNPs-intron: A computational framework for predicting pathogenic impact of intronic single nucleotide variants

Hai Lin, Katherine A. Hargreaves, Rudong Li, Jill L. Reiter, Yue Wang, Matthew Mort, David N. Cooper, Yaoqi Zhou, Chi Zhang, Michael T. Eadon, M. Eileen Dolan, Joseph Ipe, Todd C. Skaar, Yunlong Liu

Research output: Contribution to journal (Article)

Abstract

Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.
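The abstract names the approach (a random forest over splicing, protein structure, and conservation features) without giving details here, so the following is only a toy sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the three feature names are hypothetical placeholders, the data are synthetic, and each "tree" is simplified to a one-split stump so the example runs on the Python standard library alone.

```python
import random

# Hypothetical placeholders for the three feature groups named in the abstract.
FEATURES = ["splicing_score_change", "protein_structure_score", "conservation_score"]

def make_synthetic(n, rng):
    """Toy data only: rows labelled 1 ('pathogenic') get higher values on average."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([rng.gauss(0.7 if y else 0.3, 0.15) for _ in FEATURES])
        labels.append(y)
    return rows, labels

def fit_stump(rows, labels, feature):
    """Choose the threshold on one feature that minimises training errors."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        errors = sum((r[feature] > t) != bool(y) for r, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, t)
    return feature, best[1]

def fit_forest(rows, labels, n_trees, rng):
    """Bootstrap sampling plus a random feature per tree, as in a random forest."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(FEATURES))          # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_proba(forest, row):
    """Fraction of stumps that vote 'pathogenic' for this variant."""
    return sum(row[f] > t for f, t in forest) / len(forest)

rng = random.Random(0)
X, y = make_synthetic(200, rng)
forest = fit_forest(X, y, n_trees=25, rng=rng)
high = predict_proba(forest, [0.9, 0.9, 0.9])  # synthetic 'pathogenic-like' variant
low = predict_proba(forest, [0.1, 0.1, 0.1])   # synthetic 'neutral-like' variant
```

Averaging many weak, decorrelated learners is what lets a forest combine heterogeneous feature groups into a single score; a realistic version would use full decision trees and features computed from actual variant annotations.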

Original language: English (US)
Article number: 254
Journal: Genome Biology
Volume: 20
Issue number: 1
DOI: 10.1186/s13059-019-1847-4
State: Published - Nov 28 2019

Fingerprint

Introns
Nucleotides
Assays
Prioritization
RNA splicing
RNA
Protein structure
Proteins
Pathogenesis
Prediction

Keywords

  • Bioinformatics
  • Computational biology
  • Disease pathogenesis
  • High-throughput screening assay
  • Intron
  • Prediction model
  • Random forest
  • RNA splicing
  • Single nucleotide polymorphism

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

RegSNPs-intron: A computational framework for predicting pathogenic impact of intronic single nucleotide variants. / Lin, Hai; Hargreaves, Katherine A.; Li, Rudong; Reiter, Jill L.; Wang, Yue; Mort, Matthew; Cooper, David N.; Zhou, Yaoqi; Zhang, Chi; Eadon, Michael T.; Dolan, M. Eileen; Ipe, Joseph; Skaar, Todd C.; Liu, Yunlong.

In: Genome Biology, Vol. 20, No. 1, 254, 28.11.2019.

@article{6aecd9be5c8149958c0843b049054387,
title = "RegSNPs-intron: A computational framework for predicting pathogenic impact of intronic single nucleotide variants",
abstract = "Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.",
keywords = "Bioinformatics, Computational biology, Disease pathogenesis, High-throughput screening assay, Intron, Prediction model, Random forest, RNA splicing, Single nucleotide polymorphism",
author = "Hai Lin and Hargreaves, {Katherine A.} and Rudong Li and Reiter, {Jill L.} and Yue Wang and Matthew Mort and Cooper, {David N.} and Yaoqi Zhou and Chi Zhang and Eadon, {Michael T.} and Dolan, {M. Eileen} and Joseph Ipe and Skaar, {Todd C.} and Yunlong Liu",
year = "2019",
month = "11",
day = "28",
doi = "10.1186/s13059-019-1847-4",
language = "English (US)",
volume = "20",
journal = "Genome Biology",
issn = "1474-7596",
publisher = "BioMed Central",
number = "1",
}

TY - JOUR
T1 - RegSNPs-intron
T2 - A computational framework for predicting pathogenic impact of intronic single nucleotide variants
AU - Lin, Hai
AU - Hargreaves, Katherine A.
AU - Li, Rudong
AU - Reiter, Jill L.
AU - Wang, Yue
AU - Mort, Matthew
AU - Cooper, David N.
AU - Zhou, Yaoqi
AU - Zhang, Chi
AU - Eadon, Michael T.
AU - Dolan, M. Eileen
AU - Ipe, Joseph
AU - Skaar, Todd C.
AU - Liu, Yunlong
PY - 2019/11/28
Y1 - 2019/11/28
N2 - Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.
AB - Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.
KW - Bioinformatics
KW - Computational biology
KW - Disease pathogenesis
KW - High-throughput screening assay
KW - Intron
KW - Prediction model
KW - Random forest
KW - RNA splicing
KW - Single nucleotide polymorphism
UR - http://www.scopus.com/inward/record.url?scp=85075762142&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075762142&partnerID=8YFLogxK
U2 - 10.1186/s13059-019-1847-4
DO - 10.1186/s13059-019-1847-4
M3 - Article
C2 - 31779641
AN - SCOPUS:85075762142
VL - 20
JO - Genome Biology
JF - Genome Biology
SN - 1474-7596
IS - 1
M1 - 254
ER -