The study presented in this paper analyses the visual MPEG-7 descriptors from a statistical point of view. A statistical analysis can reveal properties and qualities of the descriptors used, such as redundancies and sensitivity to media content. These aspects were not considered in the MPEG-7 design process, where the major goal was optimising the retrieval rate. For the statistical analysis, eight basic visual descriptors were applied to three media collections: the Brodatz dataset, a selection from the Corel photo dataset and a set of coats-of-arms images. The resulting feature vectors were analysed with four statistical methods: mean and variance of description elements, distribution of elements, cluster analysis (hierarchical and topological) and factor analysis. The analysis revealed, for example, that most MPEG-7 descriptions are highly redundant and sensitive to the presence of colour shades.
Keywords: Visual information retrieval - MPEG-7 - Cluster analysis - Factor analysis - Self-organizing map
Published online: 6 October 2004
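As an illustration of the first analysis method named in the abstract (mean and variance of description elements), together with a simple redundancy check, a minimal sketch might look as follows. It is not the authors' implementation: the function names, the correlation threshold of 0.9 and the synthetic feature matrix are assumptions made for this example, and pairwise Pearson correlation is used only as a stand-in indicator of redundancy, not as the cluster or factor analysis applied in the paper.

```python
import numpy as np


def element_statistics(features):
    """Per-element mean and sample variance of an (n_media, n_elements) matrix."""
    features = np.asarray(features, dtype=float)
    return features.mean(axis=0), features.var(axis=0, ddof=1)


def redundant_pairs(features, threshold=0.9):
    """Index pairs (i, j), i < j, whose absolute Pearson correlation exceeds threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    features = np.asarray(features, dtype=float)
    corr = np.corrcoef(features, rowvar=False)  # columns are description elements
    i, j = np.triu_indices(corr.shape[0], k=1)  # upper triangle, no diagonal
    mask = np.abs(corr[i, j]) > threshold
    return list(zip(i[mask].tolist(), j[mask].tolist()))


# Synthetic stand-in for extracted feature vectors (hypothetical data):
# three independent elements plus a fourth that nearly duplicates element 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
features = np.column_stack([base, base[:, 0] + 0.05 * rng.normal(size=200)])

mean, var = element_statistics(features)
pairs = redundant_pairs(features)
print("mean:", mean)
print("variance:", var)
print("redundant element pairs:", pairs)
```

In practice, each row would hold the description elements extracted from one media object, so elements with near-zero variance or high mutual correlation would be flagged as candidates for redundancy.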