Mining massive SNP data for identifying associated SNPs and uncovering gene relationships

Amy Webb, Aaron Albin, Zhan Ye, Majid Rastegar-Mojarad, Kun Huang, Jeffrey Parvin, Wolfgang Sadee, Lang Li, Simon Lin, Yang Xiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Studies of SNP correlations have focused on SNPs located on the same chromosome, since SNPs on different chromosomes are expected to segregate randomly. However, previous studies suggest that SNPs can be associated with each other over long distances and even across different chromosomes. To facilitate the study of SNP associations, our goal is to find SNPs that co-occur in a significant number of samples regardless of their genomic distance, and subsequently to study the relationships among these associated SNPs and their corresponding genes. Mining co-occurring SNP associations is computationally challenging, which motivates us to design an efficient data mining algorithm, FCIRC, to mine SNP associations from massive SNP data. By applying our method to both the original SNP data and random chromosome permutation data, we demonstrate that it is able to find non-random SNP associations across multiple chromosomes. Among the large number of associated SNPs identified by our method, many involve multiple chromosomes. Some SNP associations also suggest novel relationships among the corresponding genes, and some may imply biological and disease mechanisms related to those genes.
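To make the co-occurrence problem concrete, the following is a minimal sketch of a naive pairwise count: it reports SNP pairs that appear together in at least a given number of samples, regardless of where the SNPs lie in the genome. It is not the FCIRC algorithm, which the abstract does not describe. The function name `cooccurring_pairs`, the `min_support` parameter, and the toy SNP identifiers are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(samples, min_support):
    """Return SNP pairs found together in at least `min_support` samples.

    `samples` maps a sample ID to the set of SNPs that sample carries.
    This brute-force count is quadratic in the number of SNPs per sample,
    so it only illustrates the problem; it is not an efficient miner.
    """
    counts = Counter()
    for snps in samples.values():
        # Sort so each unordered pair is always counted under the same key.
        for pair in combinations(sorted(snps), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy data (not from the paper): sample ID -> SNPs carried.
samples = {
    "s1": {"rsA", "rsB", "rsC"},
    "s2": {"rsA", "rsB"},
    "s3": {"rsA", "rsB", "rsD"},
    "s4": {"rsC", "rsD"},
}

print(cooccurring_pairs(samples, min_support=3))
```

On this toy input only the pair (`rsA`, `rsB`) reaches the threshold of three samples. The quadratic cost of enumerating every pair per sample is one reason a dedicated algorithm is needed at genome scale.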

Original language: English (US)
Title of host publication: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 304-313
Number of pages: 10
ISBN (Electronic): 9781450328944
State: Published - Sep 20 2014
Event: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States
Duration: Sep 20 2014 – Sep 23 2014

Publication series

Name: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Other

Other: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014
Country: United States
City: Newport Beach
Period: 9/20/14 – 9/23/14

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Software
  • Biomedical Engineering

Cite this

    Webb, A., Albin, A., Ye, Z., Rastegar-Mojarad, M., Huang, K., Parvin, J., Sadee, W., Li, L., Lin, S., & Xiang, Y. (2014). Mining massive SNP data for identifying associated SNPs and uncovering gene relationships. In ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 304-313). (ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics). Association for Computing Machinery, Inc. https://doi.org/10.1145/2649387.264939