
Time-Interval Sampling for Improved Estimations in Data Warehouses

Pedro Furtado and João Pedro Costa

(7)  Dep. Engenharia Informática, Universidade de Coimbra, Polo II, Pinhal de Marrocos, 3030 Coimbra, Portugal
(8)  Dep. Informática e de Sistemas, Instituto Superior de Engenharia de Coimbra, Quinta da Nora, Rua Pedro Nunes, 3030-119 Coimbra, Portugal
Abstract
In large data warehouses, pre-computed sampling summaries can return very fast approximate answers to user queries and are well suited to all types of exploratory analysis. However, their use is constrained by the need for a representative number of samples in each grouping interval to yield acceptable accuracy. In this paper we propose and evaluate a technique that addresses this representation issue by using time interval-biased stratified samples (TISS). The technique delivers fast, accurate analysis by taking advantage of the importance of the time dimension in most user analyses. It is designed as a transparent middle layer that analyzes each query and rewrites it to use a summary instead of the base data warehouse. The estimations and error bounds returned by the technique are compared with those of traditional sampling summaries, showing that it achieves a significant improvement in accuracy.
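
The abstract does not spell out how the time-biased strata are built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that rows are stratified by a time period, that each period gets its own sampling rate (for example, higher rates for recent periods), and that each sampled row carries a weight equal to stratum size divided by sample size so that SUM aggregates can be scaled back up. The names `build_time_biased_sample`, `estimate_sum`, and `rate_for_period` are hypothetical.

```python
import random
from collections import defaultdict


def build_time_biased_sample(rows, rate_for_period, seed=0):
    """Draw a stratified sample with one stratum per time period.

    rows: list of (period, group, value) tuples.
    rate_for_period: hypothetical callback giving the sampling rate
        (0 < rate <= 1) for a period; the bias toward some periods is
        an assumption of this sketch, not taken from the paper.
    Returns a list of (period, group, value, weight) tuples.
    """
    rng = random.Random(seed)
    by_period = defaultdict(list)
    for row in rows:
        by_period[row[0]].append(row)

    sample = []
    for period, stratum in by_period.items():
        # Keep at least one row per stratum, never more than it holds.
        n = max(1, round(len(stratum) * rate_for_period(period)))
        n = min(n, len(stratum))
        weight = len(stratum) / n  # each sampled row stands for `weight` rows
        for p, group, value in rng.sample(stratum, n):
            sample.append((p, group, value, weight))
    return sample


def estimate_sum(sample, period=None):
    """Estimate SUM(value) per group, optionally restricted to one period."""
    totals = defaultdict(float)
    for p, group, value, weight in sample:
        if period is None or p == period:
            totals[group] += value * weight
    return dict(totals)
```

In this sketch, a period sampled at rate 1.0 is answered exactly, while sparsely sampled periods are answered approximately from fewer rows. A middle layer of the kind the abstract describes would direct an aggregate query to such a summary instead of the base tables.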

Contact Information Pedro Furtado
Email: pnf@dei.uc.pt

Contact Information João Pedro Costa
Email: jcosta@isec.pt


