Managers of electronic commerce sites need to learn as much as possible about their customers and those browsing their virtual
premises, in order to maximise the return on marketing expenditure. The discovery of marketing-related navigation patterns
requires the development of data mining algorithms capable of discovering sequential access patterns from web logs. This
paper introduces a new algorithm called MiDAS that extends traditional sequence discovery with a wide range of web-specific features. Domain knowledge is described as
flexible navigation templates that can specify generic navigational behaviour of interest, network structures for the capture
of web site topologies, concept hierarchies and syntactic constraints. Unlike existing approaches, MiDAS supports sequence
discovery from multidimensional data, which allows the detection of sequences across monitored attributes,
such as URLs and HTTP referrers. Three methods for pruning the sequences, resulting in three different types of navigational
behaviour, are presented. The experimental evaluation has shown promising results in terms of functionality as well as scalability.
This research has partly been funded by the ESPRIT project No 26749 (MIMIC — Mining the Internet for Marketing IntelligenCe).