Leveraging Structural Knowledge for Hierarchically-Informed Keyword Weight Propagation in the Web
Jong Wook Kim (1) and K. Selçuk Candan (1)
(1) Comp. Sci. and Eng. Dept., Arizona State University, Tempe, AZ 85287
Abstract
Although web navigation hierarchies, such as Yahoo.com and the Open Directory Project, enable effective browsing, their individual
nodes cannot be indexed for search independently. This is because the contents of the individual nodes in a hierarchy are related
to the contents of their neighbors, ancestors, and descendants in the structure.
In this paper, we show that significant improvements in precision can be obtained by leveraging knowledge about the structure
of hierarchical web content. In particular, we propose a novel keyword weight propagation technique to properly enrich the
data nodes in web hierarchies. Our approach relies on the context provided by neighbor entries in a given structure.
We use this information to develop relative-content preserving keyword propagation schemes. We compare the results obtained
through the proposed hierarchically-informed keyword weight (pre-)propagation schemes to existing state-of-the-art score and
keyword propagation techniques and show that our approach significantly improves precision.
This is an extended version of a work originally published at the WebKDD’2006 workshop [15]. This work is supported by an NSF ITR Grant, ITR-0326544, “ILearn: IT-enabled Ubiquitous Access for Educational Opportunities for Blind Individuals”, and an RSA Grant, “Ubiquitous Environment to Facilitate Access to Textbooks and Related Materials for Adults and School Age Children who are Blind or Visually Impaired”.