MildInt: Deep learning-based multimodal longitudinal data integration framework

Garam Lee, Byungkon Kang, Kwangsik Nho, Kyung Ah Sohn, Dokyoon Kim

Research output: Contribution to journal › Article

Abstract

As large amounts of heterogeneous biomedical data become available, numerous methods for integrating such datasets have been developed to extract complementary knowledge from multiple domains. Recently, deep learning approaches have shown promising results in a variety of research areas. However, applying deep learning requires expertise in constructing a deep architecture that can take multimodal longitudinal data as input. Thus, in this paper, we develop a deep learning-based Python package for data integration. The package, the deep learning-based multimodal longitudinal data integration framework (MildInt), provides a preconstructed deep learning architecture for a classification task. MildInt contains two learning phases: learning a feature representation from each modality of data, and training a classifier for the final decision. Adopting a deep architecture in the first phase leads to learning a more task-relevant feature representation than a linear model would. In the second phase, a linear regression classifier is used for detecting and investigating biomarkers from multimodal data. Thus, by combining the linear model and the deep learning model, higher accuracy and better interpretability can be achieved. We validated the performance of our package using simulation data and real data. For the real data, as a pilot study, we used clinical and multimodal neuroimaging datasets in Alzheimer's disease to predict disease progression. MildInt is capable of integrating multiple forms of numerical data, including time series and non-time series data, to extract complementary features from the multimodal dataset.
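
The two-phase design described in the abstract can be illustrated with a minimal NumPy sketch. This is not MildInt's API or the authors' implementation: the class `TinyGRU` and the functions `integrate`, `fit_linear_classifier`, and `predict` are hypothetical names, the GRU weights are random and untrained (the paper trains its first phase), bias terms are omitted, and the data are synthetic. It only shows the flow: encode each longitudinal modality to a fixed-length vector, concatenate with non-time-series features, and fit a linear model by least squares.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyGRU:
    """Untrained GRU encoder without biases (hypothetical, not part of MildInt)."""

    def __init__(self, n_in, n_hidden, rng):
        shape = (n_in + n_hidden, n_hidden)
        # One weight matrix per gate, acting on [input, previous state].
        self.Wz = rng.normal(0.0, 0.1, shape)  # update gate
        self.Wr = rng.normal(0.0, 0.1, shape)  # reset gate
        self.Wh = rng.normal(0.0, 0.1, shape)  # candidate state
        self.n_hidden = n_hidden

    def encode(self, seq):
        """Map a (timesteps, n_in) array to a fixed-length state vector."""
        h = np.zeros(self.n_hidden)
        for x in seq:
            xh = np.concatenate([x, h])
            z = sigmoid(xh @ self.Wz)
            r = sigmoid(xh @ self.Wr)
            h_cand = np.tanh(np.concatenate([x, r * h]) @ self.Wh)
            h = (1.0 - z) * h + z * h_cand
        return h


def integrate(series_by_modality, static, encoders):
    """Phase 1: concatenate per-modality encodings with static features."""
    rows = []
    for i in range(len(static)):
        parts = [enc.encode(series[i])
                 for series, enc in zip(series_by_modality, encoders)]
        parts.append(static[i])
        rows.append(np.concatenate(parts))
    return np.vstack(rows)


def fit_linear_classifier(X, y):
    """Phase 2: least-squares linear regression on 0/1 labels, with intercept."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y.astype(float), rcond=None)
    return w


def predict(X, w):
    """Threshold the regression output at 0.5 to obtain class labels."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w >= 0.5).astype(int)


# Synthetic data: two longitudinal modalities with varying numbers of visits,
# plus two non-time-series features per subject.
rng = np.random.default_rng(0)
n_subjects = 20
mod_a = [rng.normal(size=(int(rng.integers(2, 6)), 3)) for _ in range(n_subjects)]
mod_b = [rng.normal(size=(int(rng.integers(2, 6)), 2)) for _ in range(n_subjects)]
static = rng.normal(size=(n_subjects, 2))
labels = rng.integers(0, 2, size=n_subjects)

encoders = [TinyGRU(3, 4, rng), TinyGRU(2, 4, rng)]
features = integrate([mod_a, mod_b], static, encoders)
weights = fit_linear_classifier(features, labels)
predictions = predict(features, weights)
print(features.shape, weights.shape)
```

Because the second phase is linear, each entry of `weights` maps to one integrated feature, which is the kind of interpretability the abstract attributes to combining a linear model with a deep one.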

Original language: English (US)
Article number: 617
Journal: Frontiers in Genetics
Volume: 10
Issue number: JUN
DOI: 10.3389/fgene.2019.00617
State: Published - Jan 1 2019

Fingerprint

Learning
Linear Models
Neuroimaging
Disease Progression
Alzheimer Disease
Biomarkers
Research
Datasets

Keywords

  • Alzheimer's disease
  • Data integration
  • Gated recurrent unit
  • Multimodal deep learning
  • Python package

ASJC Scopus subject areas

  • Molecular Medicine
  • Genetics
  • Genetics (clinical)

Cite this

MildInt: Deep learning-based multimodal longitudinal data integration framework. / Lee, Garam; Kang, Byungkon; Nho, Kwangsik; Sohn, Kyung Ah; Kim, Dokyoon.

In: Frontiers in Genetics, Vol. 10, No. JUN, 617, 01.01.2019.

@article{35e43a5c75274f71a704f657586d57d6,
title = "MildInt: Deep learning-based multimodal longitudinal data integration framework",
keywords = "Alzheimer's disease, Data integration, Gated recurrent unit, Multimodal deep learning, Python package",
author = "Garam Lee and Byungkon Kang and Kwangsik Nho and Sohn, {Kyung Ah} and Dokyoon Kim",
year = "2019",
month = "1",
day = "1",
doi = "10.3389/fgene.2019.00617",
language = "English (US)",
volume = "10",
journal = "Frontiers in Genetics",
issn = "1664-8021",
publisher = "Frontiers Media S. A.",
number = "JUN",

}

TY  - JOUR
T1  - MildInt
T2  - Deep learning-based multimodal longitudinal data integration framework
AU  - Lee, Garam
AU  - Kang, Byungkon
AU  - Nho, Kwangsik
AU  - Sohn, Kyung Ah
AU  - Kim, Dokyoon
PY  - 2019/1/1
Y1  - 2019/1/1
KW  - Alzheimer's disease
KW  - Data integration
KW  - Gated recurrent unit
KW  - Multimodal deep learning
KW  - Python package
UR  - http://www.scopus.com/inward/record.url?scp=85069038144&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85069038144&partnerID=8YFLogxK
U2  - 10.3389/fgene.2019.00617
DO  - 10.3389/fgene.2019.00617
M3  - Article
AN  - SCOPUS:85069038144
VL  - 10
JO  - Frontiers in Genetics
JF  - Frontiers in Genetics
SN  - 1664-8021
IS  - JUN
M1  - 617
ER  -