Lecture Notes in Computer Science, 2007, Volume 4755/2007, 286-290, DOI: 10.1007/978-3-540-75488-6_31

Mining Subtrees with Frequent Occurrence of Similar Subtrees

Hisashi Tosaka, Atsuyoshi Nakamura and Mineichi Kudo

Abstract

We study the novel problem of mining subtrees for which similar subtrees occur frequently, and propose an algorithm for it. In our problem setting, the frequency of a subtree counts occurrences of similar subtrees as well as equivalent ones. In our experiments on tag trees of web pages, the problem could be solved fast enough for practical use. A preliminary experiment on data record extraction from web pages using our mining method gave encouraging results.
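
To illustrate what counting frequency over similar subtrees might look like, here is a minimal Python sketch. It is not the algorithm proposed in the paper. The abstract does not define the similarity measure, so the sketch assumes a hypothetical one: two ordered labeled trees are similar if a simple top-down edit distance (relabel a node at cost 1, insert or delete a whole child subtree at a cost equal to its size) is at most a threshold. The names `distance`, `count_similar`, and `max_dist`, and the restriction to complete subtrees rooted at each node, are likewise assumptions made for illustration.

```python
from functools import lru_cache

# A tree is a (label, children) pair; children is a tuple of trees.
# Tuples are hashable, so distances can be memoized.


def size(tree):
    """Number of nodes in the tree."""
    _, children = tree
    return 1 + sum(size(c) for c in children)


@lru_cache(maxsize=None)
def distance(a, b):
    """Assumed similarity measure: a simple top-down edit distance.

    Relabeling a node costs 1; inserting or deleting a whole child
    subtree costs its size. Child lists are aligned in order.
    """
    (label_a, kids_a), (label_b, kids_b) = a, b
    m, n = len(kids_a), len(kids_b)
    # dp[i][j] = cost of aligning the first i children of a
    # with the first j children of b.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + size(kids_a[i - 1])
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(
                dp[i - 1][j] + size(kids_a[i - 1]),      # delete a child subtree
                dp[i][j - 1] + size(kids_b[j - 1]),      # insert a child subtree
                dp[i - 1][j - 1] + distance(kids_a[i - 1], kids_b[j - 1]),
            )
    return (0 if label_a == label_b else 1) + dp[m][n]


def subtrees(tree):
    """Yield the complete subtree rooted at every node."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)


def count_similar(pattern, tree, max_dist):
    """Count subtrees of `tree` within `max_dist` of `pattern`."""
    return sum(1 for s in subtrees(tree) if distance(pattern, s) <= max_dist)


# Hypothetical tag tree: a list whose items differ slightly.
item_a = ("li", (("a", ()), ("span", ())))
item_b = ("li", (("a", ()), ("span", ())))
item_c = ("li", (("a", ()), ("b", ())))
item_d = ("li", (("a", ()),))
page = ("ul", (item_a, item_b, item_c, item_d))

exact = count_similar(item_a, page, max_dist=0)
similar = count_similar(item_a, page, max_dist=1)
```

With `max_dist=0` only equivalent subtrees are counted. Raising the threshold also counts near matches, which is the kind of counting that could let repeated but slightly varying data records in a web page register as frequent.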
