Some Criterions for Selecting the Best Data Abstractions
Makoto Haraguchi and Yoshimitsu Kudoh
Division of Electronics and Information Engineering, Hokkaido University, N-13, W-8, 060-8628 Sapporo, Japan
Abstract
This paper presents and summarizes several criteria for selecting the best data abstraction for relations in relational databases.
A data abstraction can be understood as a grouping of attribute values whose individual differences are disregarded, so that
the grouped values are replaced together by a single, more abstract value. Consequently, the relation obtained after abstraction
is more compact, and data miners can work on it more efficiently. A major problem, however, is that when an abstraction neglects
an important aspect of the data values, the quality of the extracted knowledge deteriorates. The central issue is therefore to
provide a criterion under which only adequate data abstractions are selected, so that the important information is preserved
while the size of the relation is reduced. From this viewpoint, we present three criteria and test them on the task of
classifying tuples in a relation into several given target classes. All three criteria are derived from a notion of similarity
among class distributions and are formalized using standard information theory. We also summarize our experimental
results for the classification task and discuss future work.
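
The abstract does not spell out the three criteria, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that an abstraction is a mapping from attribute values to group names, and it uses the drop in mutual information between the attribute and the target class as a stand-in measure of how much class-relevant information the grouping discards. The function names, the toy relation, and the two example groupings are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Return I(X; C) in bits for a list of (value, class_label) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    pc = Counter(c for _, c in pairs)
    return sum(
        (k / n) * log2(k * n / (px[x] * pc[c]))
        for (x, c), k in joint.items()
    )


def information_loss(pairs, abstraction):
    """Drop in I(X; C) when each value is replaced by its group (if any)."""
    abstracted = [(abstraction.get(x, x), c) for x, c in pairs]
    return mutual_information(pairs) - mutual_information(abstracted)


# Toy relation of (attribute value, target class) tuples; the data is invented.
rows = (
    [("apple", "yes")] * 4 + [("apple", "no")] * 1
    + [("pear", "yes")] * 4 + [("pear", "no")] * 1
    + [("beef", "yes")] * 1 + [("beef", "no")] * 4
)

# Hypothetical abstractions: the first merges values with the same class
# distribution, the second merges values with very different ones.
similar = {"apple": "fruit", "pear": "fruit"}
dissimilar = {"apple": "food", "beef": "food"}

loss_similar = information_loss(rows, similar)
loss_dissimilar = information_loss(rows, dissimilar)
print(f"loss when merging similar values:    {loss_similar:.4f}")
print(f"loss when merging dissimilar values: {loss_dissimilar:.4f}")
```

In this toy setting, merging "apple" and "pear" loses essentially no information about the class, because both values have the same class distribution, while merging "apple" and "beef" loses a noticeable amount. A selection criterion in this spirit would prefer the first grouping, since it shrinks the relation without blurring the class distributions.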