A contour-based robust algorithm for text detection in color images

Yangxing Liu, Satoshi Goto, Takeshi Ikenaga

Research output: Contribution to journal (Article)

47 Citations (Scopus)

Abstract

Text detection in color images has been an active research area for the past few decades. In this paper, we present a novel approach for accurately detecting text in color images, including images with complex backgrounds. The proposed algorithm combines connected component analysis with texture feature analysis of the contours of unknown text regions. First, we use a color image edge detection algorithm to extract all possible text edge pixels. Connected component analysis is then performed on these edge pixels to detect the external contour and any internal contours of potential text regions. The gradient and geometrical characteristics of each region contour are examined to construct candidate text regions and to classify some regions as non-text. Each candidate text region is then verified with texture features derived from the wavelet domain. Finally, the expectation-maximization (EM) algorithm is used to binarize each text region in preparation for recognition. In contrast to previous approaches, our algorithm combines the efficiency of connected component based methods with the robustness of texture based analysis. Experimental results show that the proposed algorithm is robust with respect to character size, orientation, color, and language, and that it provides reliable text binarization results.
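The abstract does not say how the EM step is set up, so the following is only a minimal sketch of one common way to binarize a region with expectation maximization: fit a two-component one-dimensional Gaussian mixture to the pixel intensities and assign each pixel to the more likely component. The function name em_binarize, the grayscale input, the initialization, and the choice to treat the brighter component as text are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def em_binarize(gray, n_iter=50, tol=1e-6):
    """Binarize a grayscale patch with a two-component 1-D Gaussian mixture.

    Hypothetical helper: returns a boolean mask that is True where a pixel
    is assigned to the brighter component (assumed here to be the text).
    """
    gray = np.asarray(gray, dtype=np.float64)
    x = gray.ravel()

    # Initial guesses (an assumption): means at 1/4 and 3/4 of the range,
    # a shared variance, and equal mixing weights.
    lo, hi = x.min(), x.max()
    mu = np.array([lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)])
    var = np.full(2, max(x.var(), 1e-6))
    weight = np.array([0.5, 0.5])

    def log_joint(mu, var, weight):
        # log(weight_k * N(x | mu_k, var_k)) for every pixel and component.
        return (np.log(weight)
                - 0.5 * np.log(2.0 * np.pi * var)
                - 0.5 * (x[:, None] - mu) ** 2 / var)

    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        lj = log_joint(mu, var, weight)
        log_total = np.logaddexp(lj[:, 0], lj[:, 1])
        resp = np.exp(lj - log_total[:, None])

        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)
        weight = nk / nk.sum()

        # Stop once the log-likelihood no longer improves noticeably.
        ll = log_total.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    # Hard assignment: pick the more likely component for each pixel.
    labels = np.argmax(log_joint(mu, var, weight), axis=1)
    bright = int(np.argmax(mu))
    return (labels == bright).reshape(gray.shape)
```

On a synthetic patch with a dark background and a bright block, the returned mask should cover the block only. For dark text on a light background, the mask would need to be inverted.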

Original language: English
Pages (from-to): 1221-1230
Number of pages: 10
Journal: IEICE Transactions on Information and Systems
Volume: E89-D
Issue number: 3
DOI: 10.1093/ietisy/e89-d.3.1221
Publication status: Published - 2006

Fingerprint

Color
Textures
Pixels
Edge detection

Keywords

  • Connected component analysis
  • Edge detection
  • Region contour
  • Text detection
  • Texture analysis

ASJC Scopus subject areas

  • Information Systems
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

A contour-based robust algorithm for text detection in color images. / Liu, Yangxing; Goto, Satoshi; Ikenaga, Takeshi.

In: IEICE Transactions on Information and Systems, Vol. E89-D, No. 3, 2006, p. 1221-1230.

@article{c3204b6c42304c8da34a53acf76bf4a6,
title = "A contour-based robust algorithm for text detection in color images",
keywords = "Connected component analysis, Edge detection, Region contour, Text detection, Texture analysis",
author = "Yangxing Liu and Satoshi Goto and Takeshi Ikenaga",
year = "2006",
doi = "10.1093/ietisy/e89-d.3.1221",
language = "English",
volume = "E89-D",
pages = "1221--1230",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "3",
}

TY - JOUR
T1 - A contour-based robust algorithm for text detection in color images
AU - Liu, Yangxing
AU - Goto, Satoshi
AU - Ikenaga, Takeshi
PY - 2006
Y1 - 2006
KW - Connected component analysis
KW - Edge detection
KW - Region contour
KW - Text detection
KW - Texture analysis
UR - http://www.scopus.com/inward/record.url?scp=33645766319&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33645766319&partnerID=8YFLogxK
U2 - 10.1093/ietisy/e89-d.3.1221
DO - 10.1093/ietisy/e89-d.3.1221
M3 - Article
AN - SCOPUS:33645766319
VL - E89-D
SP - 1221
EP - 1230
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
SN - 0916-8532
IS - 3
ER -