Document layout analysis and reading order determination for a reading robot

Yucun Pan*, Qunfei Zhao, Seiichiro Kamata

*Corresponding author for this work

Research output

9 Citations (Scopus)

Abstract

This paper proposes an efficient approach to document layout analysis and reading order determination for a reading robot. First, the input document images are preprocessed to remove noise, connect lines and domains, and reduce computation time. Second, a bottom-up, parameter-independent, two-step layout analysis algorithm based on morphology outlines the geometry of the maximum homogeneous regions and classifies them into text, tables, and pictures. Finally, the reading order is determined by a top-down recursive hierarchical algorithm derived from XY-cut, using a set of rules that depend on layout information. Important parameters are derived from statistical information about the given images so that the method adapts to different types of documents. The proposed algorithm is applied to a large number of document images, and the experimental results show that it enables the reading robot to read paper documents in different languages, even those with complex layout structures.
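
The abstract names XY-cut as the starting point for the reading order step. As a rough illustration only, the sketch below shows a plain recursive XY-cut over region bounding boxes. It is not the authors' rule-based variant: the box format, the names `xy_cut_order` and `_split`, the `min_gap` parameter, and the left-to-right, top-to-bottom ordering are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical region format: (x0, y0, x1, y1) in pixels, origin at top-left.
Box = Tuple[int, int, int, int]


def _split(boxes: List[Box], axis: int, min_gap: int) -> List[List[Box]]:
    """Group boxes into bands separated by empty gaps along one axis.

    axis = 0 looks for gaps along x (vertical cuts),
    axis = 1 looks for gaps along y (horizontal cuts).
    """
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    reach = ordered[0][axis + 2]  # farthest edge covered by the current band
    for box in ordered[1:]:
        if box[axis] - reach >= min_gap:
            groups.append([box])  # empty gap found: start a new band
            reach = box[axis + 2]
        else:
            groups[-1].append(box)
            reach = max(reach, box[axis + 2])
    return groups


def xy_cut_order(boxes: List[Box], min_gap: int = 1) -> List[Box]:
    """Return boxes in reading order using a plain recursive XY-cut."""
    if len(boxes) <= 1:
        return list(boxes)
    # Try horizontal cuts first (bands stacked top to bottom),
    # then vertical cuts (columns left to right).
    for axis in (1, 0):
        groups = _split(boxes, axis, min_gap)
        if len(groups) > 1:
            ordered: List[Box] = []
            for group in groups:
                ordered.extend(xy_cut_order(group, min_gap))
            return ordered
    # No clean cut exists: fall back to a simple positional sort.
    return sorted(boxes, key=lambda b: (b[1], b[0]))


# A title spanning the page above two columns of two paragraphs each.
title = (0, 0, 100, 10)
left1, left2 = (0, 20, 45, 50), (0, 60, 45, 90)
right1, right2 = (55, 20, 100, 40), (55, 50, 100, 90)

order = xy_cut_order([right2, left1, title, right1, left2])
```

On this toy page the title is separated first by a horizontal cut, then the two columns by a vertical cut, so the left column is read in full before the right one. Documents read right to left or in vertical columns would need a different cut order, which is presumably where layout-dependent rules such as those mentioned in the abstract come in.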

Original language: English
Title of host publication: TENCON 2010 - 2010 IEEE Region 10 Conference
Pages: 1607-1612
Number of pages: 6
DOI
Publication status: Published - 1 December 2010
Event: 2010 IEEE Region 10 Conference, TENCON 2010 - Fukuoka, Japan
Duration: 21 November 2010 to 24 November 2010

Publication series

Name: IEEE Region 10 Annual International Conference, Proceedings/TENCON

Other

Other: 2010 IEEE Region 10 Conference, TENCON 2010
Country/Territory: Japan
City: Fukuoka
Period: 10/11/21 to 10/11/24

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering
