A novel probabilistic framework is proposed for inferring gaze patterns and the structure of conversation in face-to-face
multiparty communication, based on head directions and the presence/absence of utterances of participants. First, we define
three classes of conversational regimes, which are characterized by the topology of the gaze pattern; we assume that they
indicate the structure of the conversation, i.e., who is talking to whom. Next, the problem is formulated as joint estimation
of both the regime state from the gaze pattern and utterances, and the gaze pattern from head directions. We then devise a dynamic
Bayesian network, called the Markov-switching model, in which the regime changes over time according to Markov transitions and controls
the dynamics of the gaze patterns and utterances. Furthermore, Bayesian estimation of regime, gaze pattern, and model parameters
is implemented using a Markov chain Monte Carlo method. Experiments on four-person conversations confirm accurate gaze estimation
and the effectiveness of the framework in identifying conversation structures.
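
The generative idea behind a Markov-switching model can be illustrated with a minimal sketch: a hidden regime evolves as a first-order Markov chain, and the current regime sets the probability of an observed utterance. This is only a toy illustration, not the model evaluated in the experiments. The transition matrix, the per-regime utterance probabilities, the integer regime labels, and the function names are all hypothetical, and the gaze-pattern and head-direction layers as well as the Markov chain Monte Carlo inference step are omitted.

```python
import random

# Hypothetical transition matrix over three regimes (each row sums to 1).
# The large diagonal makes regimes persist over time; the values are assumptions.
TRANSITION = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

# Hypothetical probability that an utterance is present under each regime.
UTTERANCE_PROB = [0.8, 0.5, 0.1]


def sample_regimes(n_steps, transition, rng, initial=0):
    """Sample a regime sequence of length n_steps (n_steps >= 1) from a Markov chain."""
    regimes = [initial]
    for _ in range(n_steps - 1):
        row = transition[regimes[-1]]
        # Draw the next regime using the row of the current regime as weights.
        regimes.append(rng.choices(range(len(row)), weights=row)[0])
    return regimes


def sample_utterances(regimes, utterance_prob, rng):
    """Sample a binary utterance flag per step; its probability depends on the regime."""
    return [int(rng.random() < utterance_prob[r]) for r in regimes]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
regimes = sample_regimes(200, TRANSITION, rng)
utterances = sample_utterances(regimes, UTTERANCE_PROB, rng)
```

In this sketch the regime is the only switching variable; inference would run in the opposite direction, recovering the regime sequence and the parameters from the observations.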