A document-form identification method based on constellation matching of targets is proposed. Mathematical analysis shows
that the method achieves a high identification rate when multiple targets are used. The method consists of two parts: (i) extraction
of targets such as important keywords in a document by template matching between recognised characters and word strings in
a keyword dictionary, and (ii) analysis of the positional or semantic relationship between the targets by point-pattern matching
between these targets and word location information in the keyword dictionary. All characters in the document are recognised
by means of a conventional character-recognition method. An automatic keyword-determination method, which is necessary for
making a keyword dictionary beforehand, is also proposed. This method selects the most suitable keywords from a general word
dictionary by measuring the uniqueness of keywords and the stability of their recognition. Experiments using 671 sample documents
with 107 different forms in total confirmed that (i) the keyword-determination method can determine sets of keywords automatically
for 92.5% of the 107 forms and (ii) the form-identification method can correctly identify 97.1% of the 671 document
samples at a rejection rate of 2.9%.
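
As an illustration only, the two-part idea described above (keyword targets, then point-pattern matching against stored word locations) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method evaluated in the experiments: the function names, the translation-only alignment by median offset, the pixel tolerance `tol`, and the rejection threshold `min_score` are all hypothetical choices made for the example.

```python
from statistics import median

def match_score(form_keywords, detected, tol=10.0):
    """Fraction of a form's keywords found at mutually consistent positions.

    form_keywords: {word: (x, y)} from a hypothetical keyword dictionary.
    detected:      [(word, x, y)] targets extracted from the document.
    """
    # Pair each detected target with the dictionary entry of the same word.
    pairs = [(w, form_keywords[w], (x, y))
             for w, x, y in detected if w in form_keywords]
    if not pairs:
        return 0.0
    # Assumption: the page differs from the form only by a translation,
    # estimated here as the median offset (robust to a few bad targets).
    dx = median(p[0] - f[0] for _, f, p in pairs)
    dy = median(p[1] - f[1] for _, f, p in pairs)
    # A keyword counts as a hit if its corrected position is within tol.
    hits = {w for w, f, p in pairs
            if abs(p[0] - f[0] - dx) <= tol and abs(p[1] - f[1] - dy) <= tol}
    return len(hits) / len(form_keywords)

def identify_form(dictionary, detected, min_score=0.6):
    """Return the best-matching form name, or None to reject the document."""
    best_form, best_score = None, 0.0
    for name, keywords in dictionary.items():
        score = match_score(keywords, detected)
        if score > best_score:
            best_form, best_score = name, score
    return best_form if best_score >= min_score else None

# Hypothetical dictionary with two forms and a scanned page shifted slightly.
dictionary = {
    "invoice": {"INVOICE": (100, 50), "TOTAL": (400, 700), "DATE": (450, 50)},
    "order": {"ORDER": (100, 50), "TOTAL": (300, 600), "DATE": (100, 120)},
}
detected = [("INVOICE", 112, 58), ("TOTAL", 411, 709), ("DATE", 463, 57)]
print(identify_form(dictionary, detected))
```

Using several keywords per form means a single misrecognised or misplaced target lowers the score without necessarily changing the decision, which is the intuition behind using multiple targets; documents whose best score stays below the threshold are rejected rather than misidentified.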