TY - JOUR
T1 - Meta-tree random forest
T2 - Probabilistic data-generative model and Bayes optimal prediction
AU - Dobashi, Nao
AU - Saito, Shota
AU - Nakahara, Yuta
AU - Matsushima, Toshiyasu
N1 - Funding Information:
Funding: This work was supported in part by JSPS KAKENHI Grant Numbers JP17K06446, JP19K04914 and JP19K14989.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/6
Y1 - 2021/6
AB - This paper addresses the problem of predicting a new targeting variable corresponding to a new explanatory variable, given a training dataset. To predict the targeting variable, we consider a model tree, which represents the conditional probabilistic structure of the targeting variable given the explanatory variable, and discuss the statistical optimality of prediction based on Bayes decision theory. The optimal prediction under Bayes decision theory is obtained by weighting all the model trees in the model tree candidate set, a set of model trees that is assumed to include the true model tree. Because the number of model trees in the candidate set grows exponentially with the maximum depth of the model trees, so does the computational complexity of weighting them. To solve this issue, we introduce the notion of a meta-tree and propose an algorithm called MTRF (Meta-Tree Random Forest) that uses multiple meta-trees. Theoretical and experimental analyses show the superiority of MTRF over previous decision tree-based algorithms.
KW - Bayes decision theory
KW - Data-generative model
KW - Meta-tree
KW - Prediction
KW - Random forest
UR - http://www.scopus.com/inward/record.url?scp=85108879389&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108879389&partnerID=8YFLogxK
U2 - 10.3390/e23060768
DO - 10.3390/e23060768
M3 - Article
AN - SCOPUS:85108879389
SN - 1099-4300
VL - 23
JO - Entropy
JF - Entropy
IS - 6
M1 - 768
ER -