Human Interactive Proofs and Document Image Analysis
Henry S. Baird and Kris Popat
Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA
Abstract
The recently initiated and rapidly developing research field of ‘human interactive proofs’ (HIPs) and its implications for
the document image analysis (DIA) research field are described. Over the last five years, efforts to defend Web services against
abuse by programs (‘bots’) have led to a new family of security protocols able to distinguish between human and machine users.
AltaVista pioneered this technology in 1997 [Bro01, LBBB01]. By the summer of 2000, Yahoo! and PayPal were using similar methods.
In the fall of 2000, Prof. Manuel Blum of Carnegie Mellon University and his team, stimulated by Udi Manber of Yahoo!, were
studying these and related problems [BAL00]. Soon thereafter a collaboration between the University of California at Berkeley
and the Palo Alto Research Center (PARC) built a tool based on systematically generated image degradations [CBF01]. In January
2002, Prof. Blum and the present authors ran the first workshop (at PARC) on HIPs, defined broadly as a class of challenge/response
protocols which allow a human to authenticate herself as a member of a given group - e.g. human (vs. machine), herself (vs.
anyone else), an adult (vs. a child), etc. All commercial uses of HIPs known to us exploit the gap in ability between human
and machine vision systems in reading images of machine-printed text. Many technical issues that have been systematically
studied by the DIA community are relevant to the HIP research program. This paper describes the evolution of HIP R&D, applications
of HIPs now and on the horizon, highlights of the first HIP workshop, and proposals for a DIA research agenda to advance the
state of the art of HIPs.
Keywords: Human interactive proofs - document image analysis - CAPTCHAs - abuse of web sites and services - the chatroom problem - human/machine discrimination - Turing tests - OCR performance evaluation - document image degradations - legibility of text