TY - GEN

T1 - Evaluation of error probability of classification based on the analysis of the Bayes code

AU - Saito, Shota

AU - Matsushima, Toshiyasu

N1 - Funding Information:
This work was supported in part by JSPS KAKENHI Grant Numbers JP17K00316, JP17K06446, JP18K11585, JP19K04914, and JP19K14989.
Publisher Copyright:
© 2020 IEEE.

PY - 2020/6

Y1 - 2020/6

N2 - Suppose that we have two training sequences generated by parametrized distributions P_{θ_1^*} and P_{θ_2^*}, where θ_1^* and θ_2^* are unknown. Given the training sequences, we study the problem of classifying whether a test sequence was generated according to P_{θ_1^*} or P_{θ_2^*}. This problem can be viewed as a hypothesis testing problem, and the weighted sum of type-I and type-II error probabilities is analyzed. To prove the results, we utilize the analysis of the codeword lengths of the Bayes code. It is shown that upper and lower bounds on the probability of error are characterized by terms containing the Chernoff information, the dimension of the parameter space, and the ratio between the lengths of the training sequences and the test sequence. Further, we generalize part of the preceding results to the multiple-hypothesis setup.

UR - http://www.scopus.com/inward/record.url?scp=85090403904&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85090403904&partnerID=8YFLogxK

U2 - 10.1109/ISIT44484.2020.9173981

DO - 10.1109/ISIT44484.2020.9173981

M3 - Conference contribution

AN - SCOPUS:85090403904

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 2510

EP - 2514

BT - 2020 IEEE International Symposium on Information Theory, ISIT 2020 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2020 IEEE International Symposium on Information Theory, ISIT 2020

Y2 - 21 July 2020 through 26 July 2020

ER -