Volume 12, Number 1, 55-78, DOI: 10.1007/s10044-007-0096-4

The aspect Bernoulli model: multiple causes of presences and absences

Ella Bingham, Ata Kabán and Mikael Fortelius

Abstract

We present a probabilistic multiple cause model for the analysis of binary (0–1) data. A distinctive feature of the aspect Bernoulli (AB) model is its ability to automatically detect and distinguish between “true absences” and “false absences” (both of which are coded as 0 in the data) and, similarly, between “true presences” and “false presences” (both of which are coded as 1). This is accomplished by specific additive noise components that explicitly account for such non-content-bearing causes. The AB model is thus suitable for noise removal and data-explanatory purposes, including the detection of omissions and additions. An important application of AB that we demonstrate is data-driven reasoning about palaeontological recordings. We also give results on recovering corrupted handwritten digit images and on expanding short text documents, and we demonstrate and discuss comparisons to other methods.
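The idea of additive noise components can be pictured with a small generative sketch. The abstract does not give the model equations, so everything below is an assumption made for illustration: each observation is taken to mix a few "aspects", the Bernoulli parameter of each attribute is the mixing-weighted sum of per-aspect probabilities, and two hypothetical noise aspects with probabilities fixed at 0 and 1 stand in for false absences and false presences. The function name, the array names and all numbers are invented here and are not taken from the paper.

```python
import numpy as np

def sample_aspect_bernoulli(mixing, aspect_probs, rng):
    """Sample a 0-1 matrix from a mixture-of-aspects Bernoulli sketch.

    mixing:       (n_obs, n_aspects) array, each row sums to 1.
    aspect_probs: (n_aspects, n_attrs) array, entries in [0, 1].
    Returns the sampled binary matrix and the Bernoulli parameters used.
    """
    mixing = np.asarray(mixing, dtype=float)
    aspect_probs = np.asarray(aspect_probs, dtype=float)
    # Per-entry Bernoulli parameter: convex combination of aspect probabilities.
    p = mixing @ aspect_probs
    # Draw each entry independently given its parameter.
    x = (rng.random(p.shape) < p).astype(int)
    return x, p

# Hypothetical aspects: two content-bearing ones plus two noise aspects.
aspect_probs = np.array([
    [0.9, 0.8, 0.1, 0.1],  # content aspect A
    [0.1, 0.2, 0.9, 0.7],  # content aspect B
    [0.0, 0.0, 0.0, 0.0],  # noise aspect that only produces 0s (false absences)
    [1.0, 1.0, 1.0, 1.0],  # noise aspect that only produces 1s (false presences)
])

# Hypothetical mixing weights, one row per observation.
mixing = np.array([
    [0.7, 0.0, 0.3, 0.0],  # mostly A, with some absence noise
    [0.0, 0.8, 0.0, 0.2],  # mostly B, with some presence noise
    [0.0, 0.0, 1.0, 0.0],  # pure absence noise
    [0.0, 0.0, 0.0, 1.0],  # pure presence noise
])

rng = np.random.default_rng(0)
x, p = sample_aspect_bernoulli(mixing, aspect_probs, rng)
```

In this sketch a 0 in the first row may come either from content aspect A assigning low probability to an attribute or from the absence-noise aspect, which is the kind of ambiguity the abstract describes. Fitting the weights and probabilities from data is outside the scope of the sketch.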

Keywords: Data mining, Probabilistic latent variable models, Multiple cause models, 0–1 data

A part of the work of Ella Bingham was performed while visiting the School of Computer Science, University of Birmingham, UK.
