The popularization of the Web has made a huge volume of data available to a large audience. On many Web sites,
such as bookstores, electronic catalogs, travel agencies, etc., the pages constitute documents which are composed of pieces
of data whose overall structure can be easily recognized. Such pages are called data-rich and can be seen as collections of
complex objects. In this paper, we show how such objects can be represented by nested tables, which are simple, intuitive,
and quite convenient for expressing their implicit structure. Our assumption is that, for most sites of interest, only a few
examples are required to reveal the structure of the objects. To corroborate our assumption, we describe a data extraction
tool that adopts this approach and present the results of experiments carried out with this tool.
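
As a rough illustration of the nested-table idea, the following sketch models a complex object from a hypothetical bookstore page as a table whose cells hold either an atomic value or another table. The data, the column names, and the `flatten` helper are assumptions made for this example only; they are not taken from the tool described in this paper.

```python
# A nested table is modeled here as a list of rows (dicts).
# A cell is either an atomic value (str) or another nested table (list).
# All names and values below are hypothetical, for illustration only.
authors = [
    {
        "name": "A. Author",
        "books": [
            {"title": "First Title", "price": "10.00"},
            {"title": "Second Title", "price": "25.50"},
        ],
    },
    {
        "name": "B. Writer",
        "books": [
            {"title": "Only Title", "price": "7.25"},
        ],
    },
]


def flatten(table):
    """Unnest a nested table into flat rows, one per innermost row.

    Rows whose nested table is empty produce no output rows.
    """
    flat = []
    for row in table:
        # Start from the atomic columns of this row.
        atomic = {k: v for k, v in row.items() if not isinstance(v, list)}
        partial = [atomic]
        # Combine with every row of each nested column.
        for key, value in row.items():
            if isinstance(value, list):
                inner_rows = flatten(value)
                partial = [
                    {**p, **{f"{key}.{k}": v for k, v in inner.items()}}
                    for p in partial
                    for inner in inner_rows
                ]
        flat.extend(partial)
    return flat


if __name__ == "__main__":
    for flat_row in flatten(authors):
        print(flat_row)
```

The nested form keeps each author together with the list of that author's books, which mirrors how such an object appears on a page, while the flat form repeats the author name once per book.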
This work is supported by Project SIAM (grant MCT/FINEP/PRONEX 76.97.1016.00) and by individual research grants from CNPq
and CAPES.