Learnable Focused Crawling Based on Ontology

Hai-Tao Zheng (1), Bo-Yeong Kang (1) and Hong-Gee Kim

(1)  Biomedical Knowledge Engineering Laboratory, Dentistry College, Seoul National University, 28 Yeongeon-dong, Jongro-gu, Seoul, Korea
Abstract
Focused crawling selectively seeks out pages that are relevant to a predefined set of topics. Because an ontology is a well-formed knowledge representation, ontology-based focused crawling approaches have attracted research attention. However, these approaches rely on manually predefined concept weights to calculate the relevance scores of web pages, which makes it difficult to obtain optimal concept weights and to maintain a stable harvest rate during the crawling process. To address this issue, we propose a learnable focused crawling approach based on ontology: an ANN (Artificial Neural Network) is constructed from a domain-specific ontology and applied to the classification of web pages. Experimental results show that our approach outperforms the breadth-first search crawling approach, the simple keyword-based crawling approach, and the focused crawling approach using only the domain-specific ontology.
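
The contrast the abstract draws, between fixed concept weights and learned ones, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the concept list, the term-frequency features, and the single sigmoid unit (standing in for the ANN, whose architecture this abstract does not describe) are hypothetical and are not taken from the paper. Harvest rate is computed in its usual sense, as the fraction of crawled pages that are relevant.

```python
import math

# Hypothetical ontology concepts; illustrative only, not from the paper.
CONCEPTS = ["cell", "protein", "gene"]


def concept_features(text, concepts=CONCEPTS):
    """Relative frequency of each concept term among the page's tokens."""
    tokens = text.lower().split()
    total = max(len(tokens), 1)
    return [tokens.count(c) / total for c in concepts]


def manual_relevance(features, weights):
    """Fixed-weight baseline: weighted sum of concept frequencies."""
    return sum(w * f for w, f in zip(weights, features))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(features, weights, bias):
    """Learned relevance: output of a single sigmoid unit."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def train_weights(samples, labels, epochs=500, lr=1.0):
    """Learn concept weights with per-sample gradient descent on log loss.

    Assumption: one sigmoid unit replaces the paper's ANN for brevity.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = score(x, weights, bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def harvest_rate(relevant_flags):
    """Fraction of crawled pages judged relevant (0.0 for an empty crawl)."""
    flags = list(relevant_flags)
    return sum(flags) / len(flags) if flags else 0.0


if __name__ == "__main__":
    # Toy labeled pages (hypothetical): 1 = on-topic, 0 = off-topic.
    pages = [
        ("gene protein cell study", 1),
        ("protein gene expression", 1),
        ("football match score tonight", 0),
        ("cheap flights and hotels", 0),
    ]
    X = [concept_features(text) for text, _ in pages]
    y = [label for _, label in pages]
    w, b = train_weights(X, y)
    for (text, _), x in zip(pages, X):
        print(f"{score(x, w, b):.3f}  {text}")
```

In a crawler, the learned score would decide which outgoing links to follow, and `harvest_rate` would be tracked over the sequence of fetched pages to check whether the rate stays stable as the crawl proceeds.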

Contact Information: Hong-Gee Kim
Email: hgkim@snu.ac.kr