A hierarchical representation of objects is dynamically generated from the input of a virtual vision system. It is used to
analyze a sequence of actions and extract behavior rules that can be used by the CLIPS inference engine. The vision system
is assumed to provide simplified positional and shape information about visible 3D silhouettes on a frame-by-frame basis.
A virtual agent attempts to keep track of every image without any previous knowledge about the object it represents. The
hierarchy is restructured as necessary to include newly perceived images, in such a way that it also reflects factual relationships
among them. Modifications between consecutive frames are internally interpreted and represented as functions that take
the original world description and transform it into the description of the next frame. A partial order among these functions is
defined by the domain/codomain requirements that must be satisfied for function composition, which in turn leads to the CLIPS rules.
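
As an illustration of the last step, the following minimal Python sketch shows one way a precedence relation could be derived from domain/codomain compatibility and then written out as CLIPS rules. It is not the system described here: the `Transition` class, the helper names, and the example facts (`far ball`, `near ball`, `holding ball`) are all hypothetical, and facts are modeled simply as strings. The sketch assumes that a transformation `f` may precede `g` whenever every fact required by `g` is produced by `f`; the pairs it returns are only the raw relation from which an ordering would be built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """Hypothetical frame-to-frame transformation between world descriptions."""
    name: str
    domain: frozenset    # facts that must hold in the original frame
    codomain: frozenset  # facts that hold in the next frame


def can_precede(f: Transition, g: Transition) -> bool:
    """Assumed rule: g can follow f when f's codomain covers g's domain."""
    return g.domain <= f.codomain


def precedence_pairs(transitions):
    """Collect every ordered pair (f, g) of distinct transitions where f may precede g."""
    return {
        (f.name, g.name)
        for f in transitions
        for g in transitions
        if f is not g and can_precede(f, g)
    }


def to_clips_rule(t: Transition) -> str:
    """Render a transition as a CLIPS defrule string: domain facts on the
    left-hand side, newly produced facts asserted on the right-hand side."""
    lhs = " ".join(f"({fact})" for fact in sorted(t.domain))
    rhs = " ".join(f"(assert ({fact}))" for fact in sorted(t.codomain - t.domain))
    return f"(defrule {t.name} {lhs} => {rhs})"


# Illustrative data only; these facts do not come from the described system.
approach = Transition("approach", frozenset({"far ball"}), frozenset({"near ball"}))
grasp = Transition("grasp", frozenset({"near ball"}), frozenset({"holding ball"}))

pairs = precedence_pairs([approach, grasp])
rule = to_clips_rule(grasp)
print(pairs)
print(rule)
```

With these two example transitions, only `approach` can precede `grasp`, because the codomain of `approach` contains the single fact that `grasp` requires.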