Per-node Optimization of Finite-State Mechanisms for Natural Language Processing

Alexander Troussov, Brian O’Donovan, Seppo Koskenniemi and Nikolay Glushnev

(5)  IBM Dublin Software Lab, Airways Ind. Est., Cloghran, Dublin 17, Ireland
(6)  Oy IBM Ab., P.O.Box 265, 00101 Helsinki, Finland
Abstract
Finite-state processing is typically based on structures that allow for efficient indexing and sequential search. However, this “rigid” framework has several disadvantages when used in natural language processing, especially for non-alphabetical languages. The solution is to systematically introduce polymorphic programming techniques that are adapted to particular cases. In this paper we describe the structure of a morphological dictionary implemented with finite-state automata using variable or polymorphic node formats. Each node is assigned a format from a predefined set reflecting its utility in corpora processing as measured by a number of graph theoretic metrics and statistics. Experimental results demonstrate that this approach permits a 52% increase in the performance of dictionary look-up.
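
The idea of per-node formats can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three node classes (`LinearNode`, `BinaryNode`, `HashNode`), the `choose_format` rule, and its thresholds are all hypothetical, and the only selection criterion used here is a node's fan-out, as a stand-in for the graph-theoretic metrics and corpus statistics the abstract mentions.

```python
from bisect import bisect_left


class LinearNode:
    """Hypothetical format for low fan-out nodes: sequential scan of a short list."""

    def __init__(self, edges, final):
        self.edges = sorted(edges.items())
        self.final = final

    def step(self, ch):
        for label, target in self.edges:
            if label == ch:
                return target
        return None


class BinaryNode:
    """Hypothetical format for medium fan-out nodes: binary search on sorted labels."""

    def __init__(self, edges, final):
        self.labels = sorted(edges)
        self.targets = [edges[label] for label in self.labels]
        self.final = final

    def step(self, ch):
        i = bisect_left(self.labels, ch)
        if i < len(self.labels) and self.labels[i] == ch:
            return self.targets[i]
        return None


class HashNode:
    """Hypothetical format for high fan-out nodes: direct hashed indexing."""

    def __init__(self, edges, final):
        self.table = dict(edges)
        self.final = final

    def step(self, ch):
        return self.table.get(ch)


def choose_format(fan_out, small=4, large=32):
    # Assumed selection rule: fan-out only. The thresholds are illustrative.
    if fan_out <= small:
        return LinearNode
    if fan_out <= large:
        return BinaryNode
    return HashNode


def build(words):
    """Build a plain trie, then freeze each node into its chosen format."""
    raw_root = {"edges": {}, "final": False}
    for word in words:
        node = raw_root
        for ch in word:
            node = node["edges"].setdefault(ch, {"edges": {}, "final": False})
        node["final"] = True

    def freeze(raw):
        children = {ch: freeze(child) for ch, child in raw["edges"].items()}
        return choose_format(len(children))(children, raw["final"])

    return freeze(raw_root)


def lookup(root, word):
    """Walk the automaton; every node answers step() in its own format."""
    node = root
    for ch in word:
        node = node.step(ch)
        if node is None:
            return False
    return node.final
```

Because every format exposes the same `step` method, `lookup` does not need to know which representation a node uses. A dictionary whose root branches over a large character set (for example, several thousand ideographs) would get a hashed root under this rule, while the sparse nodes deeper in the automaton would keep the compact sequential format.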

Alexander Troussov
Email: atrousso@ie.ibm.com

Brian O’Donovan
Email: Brian_ODonovan@ie.ibm.com

Seppo Koskenniemi
Email: Seppo.Koskenniemi@fi.ibm.com

Nikolay Glushnev
Email: nglushnev@ie.ibm.com