Because web pages are created, deleted, and updated dynamically, web databases must be refreshed frequently to keep their copies
of web pages up to date. Understanding the change behavior of web pages helps administrators manage their web databases.
This paper introduces a number of metrics that capture various aspects of the change behavior of web pages. We monitored
approximately 1.8 million to 3 million URLs at two-day intervals for 100 days. Using the proposed metrics, we analyze the
collected URLs and web pages. In addition, we propose a method that computes the probability that a page will be downloaded on
the next crawl.
This work was supported by a Korea Research Foundation Grant (KRF-2004-D00172).