Welcome!
To use the personalized features of this site, please log in or register.
If you have forgotten your username or password, we can help.
|
 |
Split Algorithms for Sets of High-Dimensional Objects
| |
|
Split Algorithms for Sets of High-Dimensional Objects
Jukka Teuhola7 
| (7) |
Turku Centre for Computer Science (TUCS), Lemminkäisenkatu 14 A, FIN-20520 Turku, Finland |
Abstract
This study compares general, heuristic algorithms for splitting a set of high-dimensional objects. The algorithms aim at minimizing
the overlap of the created subsets, while simultaneously satisfying a given balancing condition for their cardinalities. The
intuitive goal is to enhance clustering and disjointness, and thereby to improve retrieval performance. Three categories of
complexity are studied: O(N), O(N log N), and O(N
2), when splitting a set of N objects. A representative algorithm is suggested for each category, the main contribution being
a new class of quadratic algorithms which take advantage of existing heuristics for the traveling salesman problem. All algorithms
are quite general, making minimal assumptions about the underlying object domain. However, for low-dimensional (e.g. spatial)
objects, specialized split techniques produce better results. The experiments are therefore restricted to splitting a set
of signatures (bit-vectors in general). The trade-off between computation effort and result quality is clearly established.
Fulltext Preview (Small, Large)
 References secured to subscribers.
|
|
|
|
|
|