Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation

Xinjun Zhang, Hai Lin, Huiying Zhao, Yangyang Hao, Matthew Mort, David N. Cooper, Yaoqi Zhou, Yunlong Liu

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Small insertions/deletions (INDELs) of ≤21 bp comprise 18% of all recorded mutations causing human inherited disease and are evident in 24% of documented Mendelian diseases. INDELs affect gene function in multiple ways: for example, by introducing premature stop codons that either lead to the production of truncated proteins or affect transcriptional efficiency. However, the means by which they impact post-transcriptional regulation, including alternative splicing, have not been fully evaluated. In this study, we collate disease-causing INDELs from the Human Gene Mutation Database (HGMD) and neutral INDELs from the 1000 Genomes Project. The potential of these two types of INDELs to affect binding-site affinity of RNA-binding proteins (RBPs) was then evaluated. We identified several sequence features that can distinguish disease-causing INDELs from neutral INDELs. Moreover, we built a machine-learning predictor called PinPor (predicting pathogenic small insertions and deletions affecting post-transcriptional regulation, http://watson.compbio.iupui.edu/pinpor/) to ascertain which newly observed INDELs are likely to be pathogenic. Our results show that disease-causing INDELs are more likely to ablate RBP-binding sites and tend to affect more RBP-binding sites than neutral INDELs. Additionally, disease-causing INDELs give rise to greater deviations in binding affinity than neutral INDELs. We also demonstrated that disease-causing INDELs may be distinguished from neutral INDELs by several sequence features, such as their proximity to splice sites and their potential effects on RNA secondary structure. This predictor showed satisfactory performance in identifying numerous pathogenic INDELs, with a Matthews correlation coefficient (MCC) value of 0.51 and an accuracy of 0.75.
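
For readers unfamiliar with the two performance measures quoted in the abstract, the following is a minimal sketch of how accuracy and the Matthews correlation coefficient are computed from a binary confusion matrix. The function name `confusion_metrics` and the counts used are invented for illustration only; they are not taken from the paper and do not represent PinPor's evaluation data.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Return (accuracy, mcc) for a binary confusion matrix.

    tp/fn: pathogenic INDELs predicted correctly / missed.
    tn/fp: neutral INDELs predicted correctly / flagged wrongly.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, mcc = confusion_metrics(tp=80, tn=70, fp=30, fn=20)
print(f"accuracy={acc:.2f}, MCC={mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the pathogenic and neutral classes differ in size.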

Original language: English (US)
Article number: ddu019
Pages (from-to): 3024-3034
Number of pages: 11
Journal: Human Molecular Genetics
Volume: 23
Issue number: 11
DOIs: https://doi.org/10.1093/hmg/ddu019
State: Published - Jan 1 2014

Fingerprint

RNA-Binding Proteins
Binding Sites
Protein Binding
Mutation
Sequence Deletion
Gene Deletion
Alternative Splicing
Genome
Databases
Genes

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
  • Genetics (clinical)

Cite this

Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation. / Zhang, Xinjun; Lin, Hai; Zhao, Huiying; Hao, Yangyang; Mort, Matthew; Cooper, David N.; Zhou, Yaoqi; Liu, Yunlong.

In: Human molecular genetics, Vol. 23, No. 11, ddu019, 01.01.2014, p. 3024-3034.

Zhang, X, Lin, H, Zhao, H, Hao, Y, Mort, M, Cooper, DN, Zhou, Y & Liu, Y 2014, 'Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation', Human molecular genetics, vol. 23, no. 11, ddu019, pp. 3024-3034. https://doi.org/10.1093/hmg/ddu019
Zhang, Xinjun ; Lin, Hai ; Zhao, Huiying ; Hao, Yangyang ; Mort, Matthew ; Cooper, David N. ; Zhou, Yaoqi ; Liu, Yunlong. / Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation. In: Human molecular genetics. 2014 ; Vol. 23, No. 11. pp. 3024-3034.
@article{935aca7e7f89496d9ea10596f6eba142,
title = "Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation",
abstract = "Small insertions/deletions (INDELs) of ≤21 bp comprise 18{\%} of all recorded mutations causing human inherited disease and are evident in 24{\%} of documented Mendelian diseases. INDELs affect gene function in multiple ways: for example, by introducing premature stop codons that either lead to the production of truncated proteins or affect transcriptional efficiency. However, the means by which they impact post-transcriptional regulation, including alternative splicing, have not been fully evaluated. In this study, we collate disease-causing INDELs from the Human Gene Mutation Database (HGMD) and neutral INDELs from the 1000 Genomes Project. The potential of these two types of INDELs to affect binding-site affinity of RNA-binding proteins (RBPs) was then evaluated. We identified several sequence features that can distinguish disease-causing INDELs from neutral INDELs. Moreover, we built a machine-learning predictor called PinPor (predicting pathogenic small insertions and deletions affecting post-transcriptional regulation, http://watson.compbio.iupui.edu/pinpor/) to ascertain which newly observed INDELs are likely to be pathogenic. Our results show that disease-causing INDELs are more likely to ablate RBP-binding sites and tend to affect more RBP-binding sites than neutral INDELs. Additionally, disease-causing INDELs give rise to greater deviations in binding affinity than neutral INDELs. We also demonstrated that disease-causing INDELs may be distinguished from neutral INDELs by several sequence features, such as their proximity to splice sites and their potential effects on RNA secondary structure. This predictor showed satisfactory performance in identifying numerous pathogenic INDELs, with a Matthews correlation coefficient (MCC) value of 0.51 and an accuracy of 0.75.",
author = "Xinjun Zhang and Hai Lin and Huiying Zhao and Yangyang Hao and Matthew Mort and Cooper, {David N.} and Yaoqi Zhou and Yunlong Liu",
year = "2014",
month = "1",
day = "1",
doi = "10.1093/hmg/ddu019",
language = "English (US)",
volume = "23",
pages = "3024--3034",
journal = "Human Molecular Genetics",
issn = "0964-6906",
publisher = "Oxford University Press",
number = "11",

}

TY - JOUR

T1 - Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation

AU - Zhang, Xinjun

AU - Lin, Hai

AU - Zhao, Huiying

AU - Hao, Yangyang

AU - Mort, Matthew

AU - Cooper, David N.

AU - Zhou, Yaoqi

AU - Liu, Yunlong

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Small insertions/deletions (INDELs) of ≤21 bp comprise 18% of all recorded mutations causing human inherited disease and are evident in 24% of documented Mendelian diseases. INDELs affect gene function in multiple ways: for example, by introducing premature stop codons that either lead to the production of truncated proteins or affect transcriptional efficiency. However, the means by which they impact post-transcriptional regulation, including alternative splicing, have not been fully evaluated. In this study, we collate disease-causing INDELs from the Human Gene Mutation Database (HGMD) and neutral INDELs from the 1000 Genomes Project. The potential of these two types of INDELs to affect binding-site affinity of RNA-binding proteins (RBPs) was then evaluated. We identified several sequence features that can distinguish disease-causing INDELs from neutral INDELs. Moreover, we built a machine-learning predictor called PinPor (predicting pathogenic small insertions and deletions affecting post-transcriptional regulation, http://watson.compbio.iupui.edu/pinpor/) to ascertain which newly observed INDELs are likely to be pathogenic. Our results show that disease-causing INDELs are more likely to ablate RBP-binding sites and tend to affect more RBP-binding sites than neutral INDELs. Additionally, disease-causing INDELs give rise to greater deviations in binding affinity than neutral INDELs. We also demonstrated that disease-causing INDELs may be distinguished from neutral INDELs by several sequence features, such as their proximity to splice sites and their potential effects on RNA secondary structure. This predictor showed satisfactory performance in identifying numerous pathogenic INDELs, with a Matthews correlation coefficient (MCC) value of 0.51 and an accuracy of 0.75.

UR - http://www.scopus.com/inward/record.url?scp=84899964765&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899964765&partnerID=8YFLogxK

U2 - 10.1093/hmg/ddu019

DO - 10.1093/hmg/ddu019

M3 - Article

C2 - 24436305

AN - SCOPUS:84899964765

VL - 23

SP - 3024

EP - 3034

JO - Human Molecular Genetics

JF - Human Molecular Genetics

SN - 0964-6906

IS - 11

M1 - ddu019

ER -