Abstract

In mathematical OCR, it is necessary to analyze the two-dimensional structure of the characters and symbols that make up mathematical expressions printed in scientific documents. In this paper, we analyze the positional relationships between adjacent characters in order to discriminate automatically between baseline characters, subscripts, and superscripts, which is one of the most important and delicate parts of structure analysis. A large-scale experiment shows that this discrimination can be carried out almost perfectly (approximately 99.89%) by using the relative size and position of adjacent characters.
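
As a rough illustration of what "relative size and position of adjacent characters" can mean in practice, the following minimal sketch computes two features from a pair of bounding boxes and applies fixed thresholds. It is not the method of the paper: the `Box` type, the function names, the upward-growing y axis, and the threshold values `max_script_ratio` and `min_offset` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box. Assumption: y grows upward."""
    left: float
    bottom: float
    width: float
    height: float

    @property
    def center_y(self) -> float:
        return self.bottom + self.height / 2.0


def relative_features(parent: Box, child: Box) -> Tuple[float, float]:
    """Return (size ratio, vertical offset), both normalized by parent height."""
    size_ratio = child.height / parent.height
    offset = (child.center_y - parent.center_y) / parent.height
    return size_ratio, offset


def classify(parent: Box, child: Box,
             max_script_ratio: float = 0.85,
             min_offset: float = 0.2) -> str:
    """Label `child` relative to the preceding character `parent`.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    size_ratio, offset = relative_features(parent, child)
    is_small = size_ratio <= max_script_ratio
    if is_small and offset >= min_offset:
        return "superscript"
    if is_small and offset <= -min_offset:
        return "subscript"
    return "baseline"


if __name__ == "__main__":
    x = Box(left=0, bottom=0, width=10, height=10)
    print(classify(x, Box(left=11, bottom=6, width=6, height=6)))
    print(classify(x, Box(left=11, bottom=-3, width=6, height=6)))
    print(classify(x, Box(left=11, bottom=0, width=10, height=10)))
```

A real system would likely need to normalize for character shape (ascenders, descenders, small symbols such as punctuation) before comparing sizes and positions, which this sketch ignores.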

Mathematics Subject Classification (2000).  68T05 - 68T10

Keywords.  Mathematical documents - structure analysis of mathematical expressions - subscript and superscript
