View Related Documents

Abstract

The capability of extracting and recognizing characters printed in color documents will widen immensely the applications of OCR systems. This paper describes a new method of color segmentation to extract character areas from a color document. At first glance, the characters seem to be printed in a single color, but actual measurements reveal that the color image has a distribution of components. Compared with clustering algorithms, our method prevents oversegmentation and fusion with the background while maintaining real-time usability. It extracts the representative colors based on a histogram analysis of the color space. Our method also contains a selective local color averaging technique that removes the problem of mesh noise on high-resolution color images.
Received: 25 July 2003, Revised: 10 August 2003, Published online: 6 February 2004
Correspondence to: Hiroyuki Hase. Current address: 3-9-1 Bunkyo, Fukui-shi 910-8507, Japan

Fulltext Preview

Image of the first page of the fulltext document