The State and University Library of Denmark is developing an integrated search system called Summa and, as part of the Summa
project, a clustering module and a facet module. Simple clusters have been created for a collection of more than six and a
half million library metadata records using a linear clustering algorithm. The created clusters are used to enrich the metadata
records, and search results are presented to the user using a faceted browsing interface alongside a ranked result list. The
most frequent tags in the different facets in the search result can be calculated and presented at a rate of approximately
three million records per second per machine.
Keywords: Library Metadata - Large Data Sets - Clustering - Categorisation - Faceted Browsing
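
The facet calculation mentioned in the abstract, finding the most frequent tags per facet within a search result, can be illustrated with a minimal sketch. This is not the Summa implementation; the record layout (a dict mapping facet names to lists of tags), the function name `top_facet_tags`, and the parameter `k` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def top_facet_tags(records, facet_names, k=3):
    """Return the k most frequent tags for each facet in a search result.

    `records` is assumed to be an iterable of dicts that map a facet name
    to a list of tags (a hypothetical layout, not Summa's actual format).
    """
    counts = defaultdict(Counter)
    for record in records:
        for facet in facet_names:
            # dict.fromkeys removes duplicate tags within one record while
            # keeping their order, so each record counts a tag at most once.
            for tag in dict.fromkeys(record.get(facet, ())):
                counts[facet][tag] += 1
    # One pass over the result set, then a top-k selection per facet.
    return {facet: counts[facet].most_common(k) for facet in facet_names}
```

The sketch makes a single linear pass over the records in the search result, which is the general shape of computation that a per-second record throughput figure describes; it makes no claim to reach the throughput reported above.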