Robotic surgical systems such as Intuitive Surgical’s da Vinci system provide a rich source of motion and video data from
surgical procedures. In principle, this data can be used to evaluate surgical skill, provide surgical training feedback, or
document essential aspects of a procedure. If processed online, the data can be used to provide context-specific information
or motion enhancements to the surgeon. However, in every case, the key step is to relate recorded motion data to a model of
the procedure being performed. This paper reports our progress in developing techniques for “parsing” raw motion data from
a surgical task into a labelled sequence of surgical gestures. Our current techniques have achieved fully automated
recognition rates above 90% on 15 datasets.