Data Augmentation for Ancient Characters via Semi-MixFontGan

Zhiyi Yuan, Sei Ichiro Kamata

Research output: Conference contribution

Abstract

Ancient documents give people a way to understand history. However, existing collections suffer from class-imbalanced character datasets and from intra-class multimodality, in which a single character class appears in several fonts. As a result, both human readers and recognition systems struggle to identify these characters. To address these problems, we propose Semi-MixFontGan, a font generation method based on a semi-supervised strategy that learns from a small number of labeled font samples to aggregate the subclass information within each category and to generate characters. When generating new samples from ancient books for which only a small amount of labeled font data is available, the model automatically learns the differences between fonts and generates font-consistent characters. The model has two parts. In the first part, we propose a MixFont method that mixes labeled, unlabeled, and generated data, and then use a convolutional autoencoder to learn the font information. In the second part, a generator network, guided by Font and Content Discriminators, generates plausible and realistic images. With this model, the ancient book dataset can be made more balanced. Experiments show that the characters generated by our model look visually convincing and stay font-consistent with the training data. With the augmented data, the accuracy of the recognition network increases. Contribution: We propose a novel font generation method that uses semi-supervised learning to generate characters from a Kuzushiji dataset with only a small amount of labeled font data.
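The balancing goal stated in the abstract can be illustrated with a small sketch. The abstract does not say how many samples were generated per class, so the rule used here (top every class up to the size of the largest one) is an assumption, and the function name `generation_quota` is hypothetical rather than taken from the paper.

```python
from collections import Counter


def generation_quota(labels, target=None):
    """Return how many synthetic samples each class would need.

    Assumption: every class is topped up to `target` samples; when
    `target` is None, the size of the largest class is used.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    goal = max(counts.values()) if target is None else target
    # Classes already at or above the goal need no extra samples.
    return {cls: max(goal - n, 0) for cls, n in counts.items()}


# Toy, made-up label list standing in for an imbalanced character dataset.
labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 1
quota = generation_quota(labels)
print(quota)
```

Under this assumption, a generator such as the one described would be asked for `quota[cls]` additional images of each class before the recognition network is retrained.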

Original language: English
Title of host publication: 2020 Joint 9th International Conference on Informatics, Electronics and Vision and 2020 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728193311
DOI
Publication status: Published - 26 Aug 2020
Event: Joint 9th International Conference on Informatics, Electronics and Vision and 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020 - Kitakyushu, Japan
Duration: 26 Aug 2020 to 29 Aug 2020

Publication series

Name: 2020 Joint 9th International Conference on Informatics, Electronics and Vision and 2020 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020

Conference

Conference: Joint 9th International Conference on Informatics, Electronics and Vision and 4th International Conference on Imaging, Vision and Pattern Recognition, ICIEV and icIVPR 2020
Country: Japan
City: Kitakyushu
Period: 26 Aug 2020 to 29 Aug 2020

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Electrical and Electronic Engineering
  • Instrumentation
