A Deep Learning Approach Based on Stacked Denoising Autoencoders for Protein Function Prediction

Lester James Miranda, Takayuki Furuzuki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Predicting protein functions is a fundamental task with applications in medicine and healthcare. However, the accelerating pace of protein discovery renders slow and expensive biochemical techniques unsustainable. Machine learning is suitable for such a data-intensive task, but the presence of noise in protein datasets adds another level of difficulty. Hence, we propose a deep learning system based on a stacked denoising autoencoder that extracts robust features to improve predictive performance. We then feed the resulting features to a multilabel support-vector machine for classification. We evaluated the system on two protein benchmarks, and experimental results show that it consistently outperformed techniques that lack a denoising or feature-learning capability. This research demonstrates that learning robust representations from raw data can benefit the process of predicting protein functions.
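
The feature-learning stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tied weights, masking noise, cross-entropy loss, plain SGD, layer sizes, corruption rate, and toy data are all assumptions chosen for brevity, and the class and function names are hypothetical. The multilabel support-vector machine stage is omitted; in practice the extracted features would be passed to a separate classifier.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class DenoisingAutoencoder:
    """One tied-weight denoising autoencoder layer (illustrative, assumed design)."""

    def __init__(self, n_in, n_hidden, corruption=0.3, lr=0.1, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_hidden = n_in, n_hidden
        self.corruption, self.lr = corruption, lr
        # W has shape n_hidden x n_in and is shared by encoder and decoder.
        self.W = [[self.rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b_h = [0.0] * n_hidden
        self.b_v = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W, self.b_h)]

    def decode(self, h):
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(self.n_hidden))
                        + self.b_v[i])
                for i in range(self.n_in)]

    def corrupt(self, x):
        # Masking noise: zero each input with probability `corruption`.
        return [0.0 if self.rng.random() < self.corruption else xi for xi in x]

    def train_step(self, x):
        # Encode a corrupted copy, but reconstruct the clean input.
        x_tilde = self.corrupt(x)
        h = self.encode(x_tilde)
        r = self.decode(h)
        # Cross-entropy loss with sigmoid outputs: output delta is r - x.
        d_out = [ri - xi for ri, xi in zip(r, x)]
        # Backpropagate through the shared weights to the hidden layer.
        d_hid = [sum(self.W[j][i] * d_out[i] for i in range(self.n_in))
                 * h[j] * (1.0 - h[j]) for j in range(self.n_hidden)]
        for j in range(self.n_hidden):
            for i in range(self.n_in):
                # Tied weights receive gradient from both encoder and decoder.
                grad = d_hid[j] * x_tilde[i] + d_out[i] * h[j]
                self.W[j][i] -= self.lr * grad
            self.b_h[j] -= self.lr * d_hid[j]
        for i in range(self.n_in):
            self.b_v[i] -= self.lr * d_out[i]


def train_stack(data, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise training: each layer learns from the previous codes."""
    layers, inputs = [], data
    for k, n_hidden in enumerate(layer_sizes):
        dae = DenoisingAutoencoder(len(inputs[0]), n_hidden, seed=seed + k)
        for _ in range(epochs):
            for x in inputs:
                dae.train_step(x)
        layers.append(dae)
        # Clean (uncorrupted) encodings become the next layer's training data.
        inputs = [dae.encode(x) for x in inputs]
    return layers


def extract_features(layers, x):
    """Pass one sample through every trained encoder in order."""
    for dae in layers:
        x = dae.encode(x)
    return x


# Toy binary vectors standing in for protein descriptors (made-up data).
toy = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
       [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]]
stack = train_stack(toy, layer_sizes=[4, 2])
print(extract_features(stack, toy[0]))
```

Each layer is trained to reconstruct the clean input from a corrupted copy, which is what pushes the learned codes toward noise-robust features; the final codes would then serve as inputs to the classifier.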

Original language: English
Title of host publication: Proceedings - 2018 IEEE 42nd Annual Computer Software and Applications Conference, COMPSAC 2018
Editors: Chung-Horng Lung, Thomas Conte, Ling Liu, Toyokazu Akiyama, Kamrul Hasan, Edmundo Tovar, Hiroki Takakura, William Claycomb, Stelvio Cimato, Ji-Jiang Yang, Zhiyong Zhang, Sheikh Iqbal Ahamed, Sorel Reisman, Claudio Demartini, Motonori Nakamura
Publisher: IEEE Computer Society
Pages: 480-485
Number of pages: 6
Volume: 1
ISBN (Electronic): 9781538626665
DOIs: https://doi.org/10.1109/COMPSAC.2018.00074
Publication status: Published - 2018 Jun 8
Event: 42nd IEEE Computer Software and Applications Conference, COMPSAC 2018 - Tokyo, Japan
Duration: 2018 Jul 23 to 2018 Jul 27

Other

Other: 42nd IEEE Computer Software and Applications Conference, COMPSAC 2018
Country: Japan
City: Tokyo
Period: 18/7/23 to 18/7/27

Fingerprint

Proteins
Learning systems
Medicine
Support vector machines
Deep learning

Keywords

  • Artificial intelligence
  • Bioinformatics
  • Feature extraction
  • Machine learning
  • Medical computing
  • Multi-label classification

ASJC Scopus subject areas

  • Software
  • Computer Science Applications

Cite this

Miranda, L. J., & Furuzuki, T. (2018). A Deep Learning Approach Based on Stacked Denoising Autoencoders for Protein Function Prediction. In C-H. Lung, T. Conte, L. Liu, T. Akiyama, K. Hasan, E. Tovar, H. Takakura, W. Claycomb, S. Cimato, J-J. Yang, Z. Zhang, S. I. Ahamed, S. Reisman, C. Demartini, ... M. Nakamura (Eds.), Proceedings - 2018 IEEE 42nd Annual Computer Software and Applications Conference, COMPSAC 2018 (Vol. 1, pp. 480-485). [8377699] IEEE Computer Society. https://doi.org/10.1109/COMPSAC.2018.00074

@inproceedings{2e996771ee05481486413438077575bd,
title = "A Deep Learning Approach Based on Stacked Denoising Autoencoders for Protein Function Prediction",
abstract = "Predicting protein functions is a fundamental task with applications in medicine and healthcare. However, the accelerating pace of protein discovery renders slow and expensive biochemical techniques unsustainable. Machine learning is suitable for such a data-intensive task, but the presence of noise in protein datasets adds another level of difficulty. Hence, we propose a deep learning system based on a stacked denoising autoencoder that extracts robust features to improve predictive performance. We then feed the resulting features to a multilabel support-vector machine for classification. We evaluated the system on two protein benchmarks, and experimental results show that it consistently outperformed techniques that lack a denoising or feature-learning capability. This research demonstrates that learning robust representations from raw data can benefit the process of predicting protein functions.",
keywords = "Artificial intelligence, Bioinformatics, Feature extraction, Machine learning, Medical computing, Multi-label classification",
author = "Miranda, {Lester James} and Takayuki Furuzuki",
year = "2018",
month = "6",
day = "8",
doi = "10.1109/COMPSAC.2018.00074",
language = "English",
volume = "1",
pages = "480--485",
editor = "Chung-Horng Lung and Thomas Conte and Ling Liu and Toyokazu Akiyama and Kamrul Hasan and Edmundo Tovar and Hiroki Takakura and William Claycomb and Stelvio Cimato and Ji-Jiang Yang and Zhiyong Zhang and Ahamed, {Sheikh Iqbal} and Sorel Reisman and Claudio Demartini and Motonori Nakamura",
booktitle = "Proceedings - 2018 IEEE 42nd Annual Computer Software and Applications Conference, COMPSAC 2018",
publisher = "IEEE Computer Society",
}

TY - GEN

T1 - A Deep Learning Approach Based on Stacked Denoising Autoencoders for Protein Function Prediction

AU - Miranda, Lester James

AU - Furuzuki, Takayuki

PY - 2018/6/8

Y1 - 2018/6/8

N2 - Predicting protein functions is a fundamental task with applications in medicine and healthcare. However, the accelerating pace of protein discovery renders slow and expensive biochemical techniques unsustainable. Machine learning is suitable for such a data-intensive task, but the presence of noise in protein datasets adds another level of difficulty. Hence, we propose a deep learning system based on a stacked denoising autoencoder that extracts robust features to improve predictive performance. We then feed the resulting features to a multilabel support-vector machine for classification. We evaluated the system on two protein benchmarks, and experimental results show that it consistently outperformed techniques that lack a denoising or feature-learning capability. This research demonstrates that learning robust representations from raw data can benefit the process of predicting protein functions.

KW - Artificial intelligence

KW - Bioinformatics

KW - Feature extraction

KW - Machine learning

KW - Medical computing

KW - Multi-label classification

UR - http://www.scopus.com/inward/record.url?scp=85055449601&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055449601&partnerID=8YFLogxK

U2 - 10.1109/COMPSAC.2018.00074

DO - 10.1109/COMPSAC.2018.00074

M3 - Conference contribution

AN - SCOPUS:85055449601

VL - 1

SP - 480

EP - 485

BT - Proceedings - 2018 IEEE 42nd Annual Computer Software and Applications Conference, COMPSAC 2018

A2 - Lung, Chung-Horng

A2 - Conte, Thomas

A2 - Liu, Ling

A2 - Akiyama, Toyokazu

A2 - Hasan, Kamrul

A2 - Tovar, Edmundo

A2 - Takakura, Hiroki

A2 - Claycomb, William

A2 - Cimato, Stelvio

A2 - Yang, Ji-Jiang

A2 - Zhang, Zhiyong

A2 - Ahamed, Sheikh Iqbal

A2 - Reisman, Sorel

A2 - Demartini, Claudio

A2 - Nakamura, Motonori

PB - IEEE Computer Society

ER -