Motion-based segmentation is traditionally used for video object extraction. Objects are detected as groups of significant
moving regions and tracked through the sequence. However, this approach presents difficulties for video shots that contain
both static and dynamic moments, and detection is prone to fail in the absence of motion. In addition, retrieval of static content
is needed for high-level descriptions.
In this paper, we present a new graph-based approach to extract spatio-temporal regions. The method operates iteratively on
pairs of frames through a hierarchical merging process. Spatial merging is first performed to build spatial atomic regions,
based on color similarities. Then, we propose a new matching procedure for the temporal grouping of both static and moving
regions. A feature point tracking stage creates dynamic temporal edges between frames and groups strongly connected
regions. Space-time constraints are then applied to merge the main static regions, and a region graph matching stage completes
the procedure to reach high temporal coherence. Finally, we show the potential of our method for the segmentation of real
moving video sequences.
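
As a rough illustration of the first two stages described above, the following Python sketch groups pixels into spatial regions by color similarity and then links regions across two frames using tracked feature points. It is a simplified sketch under stated assumptions, not the method of the paper: the function names (`spatial_regions`, `temporal_edges`), the Euclidean color distance, the fixed `threshold`, and the `min_votes` rule for keeping a temporal edge are all hypothetical choices made for the example, and the tracked point pairs are assumed to be supplied by an external feature tracker.

```python
from collections import Counter


def find(parent, i):
    """Return the root of element i in a union-find forest."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


def spatial_regions(frame, threshold):
    """Label 4-connected pixels whose colors are closer than `threshold`.

    `frame` is a 2D list of (r, g, b) tuples. Each region is labeled with
    the smallest flat pixel index it contains. This pixel-level merge is
    an assumed stand-in for a hierarchical color-based merging step.
    """
    h, w = len(frame), len(frame[0])
    parent = list(range(h * w))

    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    for y in range(h):
        for x in range(w):
            # Only look right and down so each neighbor pair is visited once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and dist(frame[y][x], frame[ny][nx]) < threshold:
                    ra = find(parent, y * w + x)
                    rb = find(parent, ny * w + nx)
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)
    return [[find(parent, y * w + x) for x in range(w)] for y in range(h)]


def temporal_edges(labels_a, labels_b, tracks, min_votes):
    """Count tracked points linking each region pair across two frames.

    `tracks` is a list of ((x0, y0), (x1, y1)) point correspondences.
    Pairs supported by at least `min_votes` points are kept as edges
    (an assumed criterion for "strongly connected" regions).
    """
    votes = Counter(
        (labels_a[y0][x0], labels_b[y1][x1]) for (x0, y0), (x1, y1) in tracks
    )
    return {pair: n for pair, n in votes.items() if n >= min_votes}
```

In this sketch, a region pair supported by many tracked points would be a candidate for temporal grouping, while regions with little or no point support (for example, static or textureless ones) would have to be handled by later stages such as the space-time constraints and region graph matching mentioned above.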