Automated video surveillance applications require accurate separation of foreground and background image content. Cost-sensitive
embedded platforms place real-time performance and efficiency demands on the techniques used to accomplish this task. In this chapter,
we evaluate pixel-level foreground extraction techniques for a low-cost integrated surveillance system. We introduce a new
adaptive background modeling technique, multimodal mean (MM), which balances accuracy, performance, and efficiency to meet
embedded system requirements. Our evaluation compares several pixel-level foreground extraction techniques in terms of their
computation and storage requirements and their functional accuracy on three representative video sequences. The proposed MM algorithm
delivers accuracy comparable to that of the best alternative (mixture of Gaussians) with a 6× improvement in execution time and an
18% reduction in required storage on an eBox-2300 embedded platform.