Biologists have established that the control and regulation of gene expression are primarily determined by relatively short
sequences in the region surrounding a gene. These sequences vary in length, position, redundancy, orientation, and base composition.
Finding these short sequences is a fundamental problem in molecular biology with important applications. Although many
approaches to signal/motif (i.e., short sequence) finding exist, Pevzner and Sze reported in 2000 that most motif-finding
algorithms available at the time were incapable of detecting the target signals in their so-called Challenge Problem. In this paper,
we show that, by using an iterative-restart design, our new algorithm can correctly find the targets. Furthermore, because
some transcription factors form dimers or even more complex structures, and the transcription process can sometimes
involve multiple factors, we extend the original problem to an even more challenging one. Specifically, we address
combinatorial signals with gaps of variable length. To demonstrate the efficacy of our algorithm, we tested it on a series
of the original and the new challenge problems and compared it with several representative motif-finding algorithms. In addition,
to verify its feasibility in real-world applications, we tested it on several regulatory families of yeast genes with
known motifs. The purpose of this paper is twofold: first, to introduce an improved biological data mining algorithm that
can handle more variable regulatory signals in DNA sequences; second, to propose a new research direction
for the general KDD community.