Streaming applications, such as environment monitoring and vehicle location tracking, require handling high volumes of continuously
arriving data and sudden fluctuations in these volumes while efficiently supporting multi-dimensional historical queries.
Traditional database management systems are inappropriate for this workload because they require an excessive number of disk I/Os
to continuously update massive data streams. In this paper, we propose DCF (Data Stream Clustering Framework), a novel framework
that supports efficient data stream archiving for streaming applications. DCF substantially reduces disk I/O in the
storage system by grouping incoming data into clusters and storing the clusters rather than individual raw data elements. In addition, even when
the amount of incoming data fluctuates temporarily, DCF can still reliably store all incoming raw data by controlling
the cluster size. Our experimental results show that our approach significantly reduces the number of disk accesses
for both inserting and retrieving data.
Keywords: Data Archiving, OLAP, Clustering, R-tree, Fast Insertion, Query Performance
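
The idea of storing clusters rather than individual elements can be illustrated with a minimal sketch. It is not the DCF implementation: the class and method names (ClusterBuffer, set_cluster_size, and so on) are hypothetical, the in-memory list only stands in for disk storage, and the policy for choosing the cluster size is left to the caller because the abstract does not specify it. Each flush counts as one simulated write, so a larger cluster size means fewer writes per element.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]  # one multi-dimensional data element


@dataclass
class Cluster:
    points: List[Point]
    lower: Point  # per-dimension minimum of the cluster's bounding box
    upper: Point  # per-dimension maximum of the cluster's bounding box


@dataclass
class ClusterBuffer:
    """Hypothetical buffer that groups elements and stores them as clusters."""
    cluster_size: int = 4  # assumed tuning knob: elements per stored cluster
    pending: List[Point] = field(default_factory=list)
    stored: List[Cluster] = field(default_factory=list)  # stands in for disk
    writes: int = 0  # number of simulated write operations

    def set_cluster_size(self, size: int) -> None:
        # A caller could raise this during a burst; the policy is an assumption.
        self.cluster_size = max(1, size)

    def insert(self, p: Point) -> None:
        self.pending.append(p)
        if len(self.pending) >= self.cluster_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        lower = tuple(min(col) for col in zip(*self.pending))
        upper = tuple(max(col) for col in zip(*self.pending))
        self.stored.append(Cluster(self.pending, lower, upper))
        self.pending = []
        self.writes += 1  # one write per cluster, not per element

    def query(self, lo: Point, hi: Point) -> List[Point]:
        """Return stored points inside the axis-aligned box [lo, hi]."""
        hits: List[Point] = []
        for c in self.stored:
            # Skip clusters whose bounding box does not intersect the query box.
            if any(cu < l or cl > h
                   for cl, cu, l, h in zip(c.lower, c.upper, lo, hi)):
                continue
            hits.extend(p for p in c.points
                        if all(l <= x <= h for x, l, h in zip(p, lo, hi)))
        return hits
```

In this sketch, inserting eight elements with a cluster size of four produces two writes instead of eight, and a range query inspects only the clusters whose bounding boxes overlap the query box. Elements still waiting in the pending list are not visible to queries until they are flushed.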