This paper addresses the use of the dependencies between the textual indexing of an image (a set of keywords) and its visual indexing (colour and shape features). Experiments are conducted on a corpus of photographs from a press agency (EDITING) and on a second corpus of animal and landscape photographs (COREL). Both are manually indexed with keywords. The keywords of the news photos are drawn from a hierarchically structured thesaurus. The keywords of the COREL corpus are semantically linked using the WordNet database. A semantic clustering of the photos is built from their textual indexing. We use two different visual segmentation schemes: one based on areas of interest, the other on blobs of homogeneous colour. Both segmentation schemes are used to evaluate the performance of a content-based image retrieval system combining textual and visual descriptions. Results of visuo-textual classification show an improvement of 50% over classification using only textual information. Finally, we show how to apply this system to enhance a web image search engine. To this end, we illustrate a method that selects only the accurate images among those returned by a textual query.
Keywords: content-based image retrieval (CBIR) - visuo-textual fusion - vectorial model - multimedia indexing - Kullback-Leibler distance - segmentation
Sabrina Tollari received the B.S. and M.S. degrees in computer science from the University of Toulon in 2001 and the University of Marseilles in 2003, respectively. She is currently a Ph.D. student at the SIS laboratory (Systèmes-Information-Signal). Her main current research interest is multimedia information retrieval.
Hervé Glotin received his Bachelor's degree in computer science from the University of Paris VI. He received his Ph.D. in Cognitive Sciences in 2001 from the National Polytechnic Institute of Grenoble, France, on robust automatic audiovisual speech recognition and computational auditory scene analysis. He carried out his Ph.D. work in two laboratories: IDIAP (EPFL-CH) and ICP (CNRS-Grenoble). In 2000 he was invited to the CSLP summer workshop as an expert working with the IBM human language team on the via voice audiovisual system. In 2001 he was involved as a time life engineer/researcher at CNRS in the ERSS laboratory, Toulouse, France, specialized in semantic and syntactic language analysis. In 2002, he was invited to the NATO Advanced Studies on the dynamics of speech perception and production.
In 2003 he joined the computer science team of the SIS Lab at the University of Toulon, France, as a permanent researcher and assistant professor. He is currently conducting research on content-based information retrieval and automatic speech recognition systems. He is the author or co-author of 35 conference or journal papers on speech or multimedia document processing.
Jacques Le Maitre is a professor of computer science at the University of Toulon, France, where he leads the SIS laboratory. His main current research interests include query languages for XML databases and multimedia information retrieval.