We present a probabilistic framework for component-based automatic detection and tracking of objects in video. We represent
objects as spatio-temporal two-layer graphical models, where each node corresponds to an object or component of an object
at a given time, and the edges correspond to learned spatial and temporal constraints. Object detection and tracking are formulated
as inference over a directed loopy graph and solved with non-parametric belief propagation. This type of object model
allows object detection to exploit temporal consistency (over an arbitrarily sized temporal window) and facilitates robust
tracking of the object. The two-layer structure of the graphical model allows inference over the entire object as well as
individual components. AdaBoost detectors are used to define the likelihood and form proposal distributions for components.
Proposal distributions provide ‘bottom-up’ information that is incorporated into the inference process, enabling automatic
object detection and tracking. We illustrate our method by detecting and tracking two classes of objects, vehicles and pedestrians,
in video sequences captured by a single uncalibrated grayscale camera mounted on a moving car.
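
As a rough illustration of the model structure described above, the following Python sketch builds the node and edge lists of a two-layer spatio-temporal graph: one object node and several component nodes per time step, linked by spatial edges within a frame and temporal edges across frames. The function name `build_graph`, the component names, the edge directions, and the choice to link only adjacent frames are assumptions made for illustration; they are not taken from the paper, and the sketch does not perform inference.

```python
def build_graph(components, window):
    """Return (nodes, edges) for a two-layer spatio-temporal graph.

    Each node is a (name, t) pair. At every time step the top layer holds
    the object node and the bottom layer holds one node per component.
    Each edge is a (source, target, kind) triple.
    """
    names = ["object", *components]
    nodes = [(name, t) for t in range(window) for name in names]

    edges = []
    for t in range(window):
        # Spatial constraints: the object and each component in the same frame.
        # (Direction object -> component is an assumption.)
        for c in components:
            edges.append((("object", t), (c, t), "spatial"))
        # Temporal constraints: each node and its own copy in the next frame.
        # (Linking only adjacent frames is a simplifying assumption.)
        if t + 1 < window:
            for name in names:
                edges.append(((name, t), (name, t + 1), "temporal"))
    return nodes, edges


# Hypothetical component names for a four-part object, over a 3-frame window.
nodes, edges = build_graph(
    ["top_left", "top_right", "bottom_left", "bottom_right"], window=3
)
```

Even in this small example the graph contains cycles when edge direction is ignored (for instance object at t, component at t, component at t+1, object at t+1, back to object at t), which is why exact inference is not straightforward and an approximate method such as belief propagation on a loopy graph is a natural fit.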