This paper discusses how to count and generate strings that are ``distinct'' in two senses:
p -distinct and
b -distinct. Two strings
x on alphabet
A and
x' on alphabet
A' are said to be
p -distinct iff they represent distinct ``patterns''; that is, iff there exists no one—one mapping from
A to
A' that transforms
x into
x' . Thus
aab and
baa are
p -distinct while
aab and
ddc are
p -equivalent. On the other hand,
x and
x' are said to be
b -distinct iff they give rise to distinct border (failure function) arrays: thus
aab with border array
010 is
b -distinct from
aba with border array
001 . The number of
p -distinct (resp.
b -distinct) strings of length
n formed using exactly
k different letters is the
[k,n] entry in an infinite
p' (resp.
b' ) array. Column sums
p[n] and
b[n] in these arrays give the number of distinct strings of length
n . We present algorithms to compute, in constant time per string, all
p -distinct (resp.
b -distinct) strings of length
n formed using exactly
k letters, and we also show how to compute all elements
p'[k,n] and
b'[k,n] . These ideas and results have application to the efficient generation of appropriate test data sets for many string algorithms.
Key words. String, Word, Algorithm, Testing, Distinct.
Received December 21, 1995; revised April 28, 1997.