Using ASR methods for OCR

Ashish Arora, Paola Garcia, Shinji Watanabe, Vimal Manohar, Yiwen Shao, Sanjeev Khudanpur, Chun Chieh Chang, Babak Rekabdar, Bagher Babaali, Daniel Povey, David Etter, Desh Raj, Hossein Hadian, Jan Trmal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Hybrid deep neural network hidden Markov models (DNN-HMM) have achieved impressive results on large vocabulary continuous speech recognition (LVCSR) tasks. However, recent DNN-HMM approaches have not been explored much for text recognition. Inspired by current work in automatic speech recognition (ASR) and machine translation, we present an open-vocabulary sub-word text recognition system. The sub-word lexicon and sub-word language model (LM) help overcome the challenge of recognizing out-of-vocabulary (OOV) words, and a DNN-HMM optical model (OM) based on a time delay neural network (TDNN) and a convolutional neural network (CNN) efficiently models the sequence dependency in the line image. We present results on 12 datasets with training data varying from 6k lines to 600k lines. The system is built for 8 languages: English, French, Arabic, Chinese, Farsi, Tamil, Russian, and Korean. We report competitive results on several commonly used handwritten and printed text datasets.
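
The abstract's claim that a sub-word lexicon helps with OOV words can be illustrated with a toy byte-pair encoding (BPE) sketch; BPE is listed among the keywords, but this record does not describe the paper's actual procedure. The function names (`apply_merge`, `learn_bpe`, `segment`), the `</w>` end-of-word marker, and the tiny training list are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of training words (toy version)."""
    # Each word starts as a sequence of characters plus a hypothetical end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word (seen or unseen) into sub-word units by replaying the learned merges."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe(["low", "low", "lower", "newest", "widest"], num_merges=10)
    # "lowest" never appears in the training list, yet it still decomposes into known units.
    print(segment("lowest", merges))
```

Because segmentation falls back to single characters when no merge applies, a word absent from the training list is still represented as a sequence of sub-word units, which is the property an open-vocabulary system relies on.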

Original language: English
Title of host publication: Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Publisher: IEEE Computer Society
Pages: 663-668
Number of pages: 6
ISBN (Electronic): 9781728128610
Publication status: Published - 2019 Sep
Externally published: Yes
Event: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia
Duration: 2019 Sep 20 → 2019 Sep 25

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print): 1520-5363

Conference

Conference: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Country: Australia
City: Sydney
Period: 19/9/20 → 19/9/25

Keywords

  • ASR
  • BPE
  • LF MMI
  • OCR
  • Open Vocabulary

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
