This paper describes an international effort to unify a multimodal behavior generation framework for Embodied Conversational
Agents (ECAs). We propose a three-stage model, which we call SAIBA, whose stages represent intent planning, behavior planning,
and behavior realization. A Function Markup Language (FML), describing intent without referring to physical behavior, mediates
between the first two stages, and a Behavior Markup Language (BML), describing the desired physical realization, mediates between
the last two stages. This paper focuses on BML. We hope that this abstraction and modularization will help ECA
researchers pool their resources to build more sophisticated virtual humans.