In this paper we describe a stemming algorithm for the Galician language that simultaneously supports the four orthographic
regulations currently in use for Galician. The algorithm has already been implemented, and we have started using it in order to refine it. However,
this algorithm cannot be applied to documents that predate the first Galician orthographic regulation, which appeared
in 1977. To stem documents written before that date we have therefore adopted an exhaustive approach, which consists of defining a large collection of wordsets that allow systematic word comparisons. We also describe a tool for building
the wordsets that this approach requires.
Keywords: Stemming - Digital Libraries - Text Retrieval
This work was partially supported by CICYT (TEL99-0335-C04-02) and the Vicerrectorado de Innovation Tecnoloxica (University
of A Coruña).