We present a methodology for document processing that exploits logic-based machine learning techniques. Our claim is that
information capture and indexing can benefit from identifying the document class and the specific function of each of its
layout components. In particular, applying incremental and multistrategy machine learning techniques, rather than classical
ones, allows for an efficient solution to the problem of information capture.