A Hybrid Model for Nonlinear Regression with Missing Data Using Quasilinear Kernel

Huilin Zhu, Yanling Tian, Yanni Ren, Jinglu Hu

Research output: Contribution to journal › Article › peer-review

Abstract

Missing data is a serious problem in both research and engineering that cannot be overlooked, and datasets with missing values are difficult to model with conventional global prediction models. In this paper, we propose a hybrid model consisting of an autoencoder and a gated linear network for solving regression problems with missing values. A modeling and identification algorithm is developed in three steps. First, an extended affinity propagation (AP) clustering algorithm is applied to obtain a self-organized competitive net that divides the dataset into several clusters. Second, a multiple imputation tool based on top p% winner-take-all denoising autoencoders (DAE) is introduced to better predict missing values; rough estimates of the missing values, obtained by mean imputation and a similarity method within the clusters, serve as teacher signals for the DAE. Finally, a gated linear network is designed to construct a piecewise linear regression model with interpolation, in exactly the same way as a support vector regression with a quasilinear kernel composed from the cluster information obtained in the AP clustering step. In experiments on five datasets, the proposed method demonstrates effectiveness and robustness compared with traditional kernels and state-of-the-art methods, even on datasets with a large percentage of missing values.
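As a rough illustration of two of the ingredients named in the abstract, the sketch below shows (a) a cluster-wise mean imputation that could produce the rough estimates of missing values, and (b) a kernel composed from cluster information. It is not the authors' implementation. The abstract does not give the kernel formula, so the form used here, a linear kernel multiplied by the inner product of normalized Gaussian gate values centered on the clusters, is an assumption. The function names, the `width` parameter, and the use of `None` to mark missing entries are all hypothetical. Cluster labels and centers are taken as given, and the AP clustering and DAE steps are not reproduced. The code is kept dependency-free for clarity.

```python
import math


def cluster_mean_impute(rows, labels):
    """Fill None entries with the mean of the observed values of the same
    feature inside the same cluster (hypothetical helper).

    Falls back to the feature's global mean when the cluster has no observed
    value. Assumes every feature is observed at least once in the data.
    """
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [(lab, r[j]) for r, lab in zip(rows, labels) if r[j] is not None]
        global_mean = sum(v for _, v in observed) / len(observed)
        for i, (r, lab) in enumerate(zip(rows, labels)):
            if r[j] is None:
                same = [v for l, v in observed if l == lab]
                filled[i][j] = sum(same) / len(same) if same else global_mean
    return filled


def gates(x, centers, width):
    """Normalized Gaussian gate values, one per cluster center (assumed form)."""
    raw = [
        math.exp(-sum((a - c) ** 2 for a, c in zip(x, ctr)) / (2 * width ** 2))
        for ctr in centers
    ]
    total = sum(raw)
    return [r / total for r in raw]


def quasilinear_kernel(x, z, centers, width=1.0):
    """Assumed kernel: (1 + <x, z>) * sum_k R_k(x) * R_k(z).

    The gates R_k weight each local linear model by cluster membership, so
    the resulting regressor is piecewise linear with smooth interpolation
    between clusters.
    """
    linear = 1.0 + sum(a * b for a, b in zip(x, z))
    gx, gz = gates(x, centers, width), gates(z, centers, width)
    return linear * sum(p * q for p, q in zip(gx, gz))


# Toy data: two clusters, with None marking a missing entry.
rows = [[1.0, 2.0], [3.0, None], [10.0, 20.0], [None, 22.0]]
labels = [0, 0, 1, 1]
filled = cluster_mean_impute(rows, labels)

# Kernel value between two imputed samples, using two assumed cluster centers.
centers = [[2.0, 2.0], [10.0, 21.0]]
k = quasilinear_kernel(filled[0], filled[1], centers, width=5.0)
```

A Gram matrix built from such a kernel could then be passed to any support vector regression solver that accepts precomputed kernels. With a single cluster the gate is constant and the kernel reduces to the ordinary linear kernel with a bias term.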

Original language: English
Journal: IEEJ Transactions on Electrical and Electronic Engineering
Publication status: Accepted/In press - 2020

Keywords

  • affinity propagation algorithm
  • denoising autoencoder
  • missing data
  • nonlinear regression
  • quasilinear kernel
  • support vector regression

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
