Accelerating sparse canonical correlation analysis for large brain imaging genetics data

Jingwen Yan, Hui Zhang, Lei Du, Eric Wernert, Andrew Saykin, Li Shen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Recent advances in acquiring high-throughput neuroimaging and genomics data provide exciting new opportunities to study the influence of genetic variation on brain structure and function. Research in this emergent field, known as imaging genetics, aims to identify associations between genetic variations such as single nucleotide polymorphisms (SNPs) and neuroimaging quantitative traits (QTs). Sparse canonical correlation analysis (SCCA) is a bi-multivariate analysis method that has the potential to reveal complex multi-SNP-multi-QT associations. However, the scale and complexity of imaging genetics data have presented critical computational bottlenecks requiring new concepts and enabling tools. In this paper, we present our initial efforts in developing a set of massively parallel strategies to accelerate a widely used SCCA implementation provided by the Penalized Multivariate Analysis (PMA) software package. In particular, we exploit parallel packages of R, optimized mathematical libraries, and the automatic offload model for the Intel Many Integrated Core (MIC) architecture to accelerate SCCA. We create several simulated imaging genetics data sets of different sizes and use these synthetic data to perform a comparative study. Our performance evaluation demonstrates that a 2-fold speedup can be achieved by the proposed acceleration. The preliminary results show that by combining a data-parallel strategy with the offload model for MIC, we can significantly reduce the knowledge discovery timelines involved in applying SCCA to large brain imaging genetics data.

Original language: English
Title of host publication: ACM International Conference Proceeding Series
Publisher: Association for Computing Machinery
ISBN (Print): 9781450328937
DOIs: https://doi.org/10.1145/2616498.2616515
State: Published - 2014
Event: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014 - Atlanta, GA, United States
Duration: Jul 13 2014 to Jul 18 2014

Other

Other: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014
Country: United States
City: Atlanta, GA
Period: 7/13/14 to 7/18/14

Fingerprint

Brain
Neuroimaging
Imaging techniques
Nucleotides
Polymorphism
Software packages
Data mining
Throughput
Genetics
Multivariate Analysis

Keywords

  • Brain imaging genetics
  • Parallel computing
  • R
  • Sparse canonical correlation analysis

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Yan, J., Zhang, H., Du, L., Wernert, E., Saykin, A., & Shen, L. (2014). Accelerating sparse canonical correlation analysis for large brain imaging genetics data. In ACM International Conference Proceeding Series [4] Association for Computing Machinery. https://doi.org/10.1145/2616498.2616515

@inproceedings{4cfc2161c7df406f9b0e6d79f7c50b1c,
title = "Accelerating sparse canonical correlation analysis for large brain imaging genetics data",
keywords = "Brain imaging genetics, Parallel computing, R, Sparse canonical correlation analysis",
author = "Jingwen Yan and Hui Zhang and Lei Du and Eric Wernert and Andrew Saykin and Li Shen",
year = "2014",
doi = "10.1145/2616498.2616515",
language = "English",
isbn = "9781450328937",
booktitle = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",

}

TY  - GEN
T1  - Accelerating sparse canonical correlation analysis for large brain imaging genetics data
AU  - Yan, Jingwen
AU  - Zhang, Hui
AU  - Du, Lei
AU  - Wernert, Eric
AU  - Saykin, Andrew
AU  - Shen, Li
PY  - 2014
Y1  - 2014
KW  - Brain imaging genetics
KW  - Parallel computing
KW  - R
KW  - Sparse canonical correlation analysis
UR  - http://www.scopus.com/inward/record.url?scp=84905493802&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84905493802&partnerID=8YFLogxK
U2  - 10.1145/2616498.2616515
DO  - 10.1145/2616498.2616515
M3  - Conference contribution
AN  - SCOPUS:84905493802
SN  - 9781450328937
BT  - ACM International Conference Proceeding Series
PB  - Association for Computing Machinery
ER  -