Lecture Notes in Computer Science, 2005, Volume 3399/2005, 938-943, DOI: 10.1007/978-3-540-31849-1_89

PlusDBG: Web Community Extraction Scheme Improving Both Precision and Pseudo-Recall

Naoyuki Saida, Akira Umezawa and Hayato Yamana


Abstract

This paper proposes PlusDBG, which extends the conventional Web community extraction scheme to improve both precision and pseudo-recall. Precision is defined as the percentage of Web pages extracted as members of Web communities that are relevant, and pseudo-recall is defined as the total number of relevant Web pages extracted as members of Web communities. The proposed scheme adopts a new distance parameter, defined by the relevance between a Web page and a Web community, and extracts Web communities with higher precision and pseudo-recall. We have implemented and evaluated the proposed scheme. Our results confirm that it extracts about 3.2 times as many members of Web communities as the conventional scheme while maintaining equivalent precision.
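
The two evaluation measures defined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the representation of pages as strings, and the assumption that relevance judgments are available as a set are all hypothetical, and the data in the usage lines is a toy example rather than a result from the paper.

```python
def precision(extracted, relevant):
    """Percentage of extracted community members that are relevant.

    Both arguments are iterables of page identifiers (an assumption;
    the paper does not specify a data representation here).
    """
    extracted = set(extracted)
    if not extracted:
        return 0.0  # convention chosen for this sketch when nothing is extracted
    return 100.0 * len(extracted & set(relevant)) / len(extracted)


def pseudo_recall(extracted, relevant):
    """Total number of relevant pages among the extracted members.

    Unlike ordinary recall, this is a raw count, not a ratio, so it
    does not require knowing every relevant page on the Web.
    """
    return len(set(extracted) & set(relevant))


# Toy data, for illustration only.
members = ["a", "b", "c", "d"]
judged_relevant = {"a", "b", "c"}
p = precision(members, judged_relevant)
r = pseudo_recall(members, judged_relevant)
```

Because pseudo-recall is a count, two schemes can be compared by the ratio of their counts at similar precision, which is how the abstract's "about 3.2 times" comparison is phrased.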
