Lecture Notes in Computer Science, 2002, Volume 2423/2002, 125-136, DOI: 10.1007/3-540-45869-7_46

Text Verification in an Automated System for the Extraction of Bibliographic Data

George R. Thoma, Glenn Ford, Daniel Le and Zhirong Li

Abstract

An essential stage in any text extraction system is the manual verification of the printed material converted by OCR. This proves to be the most labor-intensive step in the process. In a system built and deployed at the National Library of Medicine to automatically extract bibliographic data from scanned biomedical journals, alternative means were considered to validate the text. This paper describes two approaches and gives preliminary performance data.
