Information technology has developed past its traditional focus on text-based data. The much-cited rapid growth of available
information has been accompanied by a diversification of information types. Multimedia data is rapidly becoming the predominant
form of information created, processed and distributed in many application domains.
Multimedia data sets are characterized by their heterogeneous nature and complex structure. Documents often combine different
modalities, for example video streams,
audio streams and textual information. Document content often features a pronounced temporal component, as in the case of
audio and video data. Multimedia documents frequently include rich semantic descriptors and complex structures of cross-modal,
inter- and intra-document references.
It is often not feasible to manually annotate such complex semantics, especially in the context of very large data sets. Various
automated methods for the extraction of semantic metadata have been proposed and evaluated. However, the capabilities of automatic
extraction are limited in terms of accuracy, performance and diversity of results. Visualisation techniques employ the vast
processing power of the human visual apparatus to quickly identify complex patterns in large amounts of data. When combined
with machine processing capabilities, such techniques provide unparalleled means for gaining insight into large data sets
in general, and into multimedia data sets in particular.
The multi-faceted nature of multimedia documents has led to a variety of visual representations for navigating, analysing
and understanding multimedia data sets. Because each representation is designed to address different aspects of
the data, approaches that combine several visualisations in a single coordinated interface have been introduced. This
chapter presents a comparative discussion of selected multimedia visual representations and tools.