This paper presents a solution to the problem of tracking people within crowded scenes. The aim is to maintain individual
object identity through a crowded scene which contains complex interactions and heavy occlusions of people. Our approach uses
the strengths of two separate methods: a global object detector and a localised frame-by-frame tracker. A temporal relationship
model of torso detections, built during periods of low activity, is used to further disambiguate identities during periods of high activity.
A single camera with no calibration and no environmental information is used. Results are compared to a standard tracking
method and ground truth. Two video sequences containing interactions, overlaps and occlusions between people are used to demonstrate
our approach. The results show that our technique performs better than a standard tracking method and can cope with challenging
occlusions and crowd interactions.