This paper discusses some of the issues in Web information extraction, focusing on automatic extraction methods that exploit
wrapper induction. In particular, we point out the limitations of traditional heuristic-based wrapper generation systems
and, as a solution, emphasize the importance of domain knowledge in the wrapper generation process.
We demonstrate the effectiveness of domain knowledge by presenting our scheme of knowledge-based wrapper generation for semi-structured
and labeled documents. Our agent-oriented information extraction system, XTROS, represents both the domain knowledge and the
wrappers as XML documents to increase modularity, flexibility, and interoperability. XTROS shows good performance on several
Web sites in the domain of real estate, and it is expected to be easily adaptable to different domains by plugging in appropriate
XML-based domain knowledge.