Welcome!
To use the personalized features of this site, please log in or register.
If you have forgotten your username or password, we can help.
My Menu
Saved Items

Semantic Similarity Measure of Polish Nouns Based on Linguistic Features

Maciej PiaseckiContact Information and Bartosz Broda1

(1)  Institute of Applied Informatics, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, Wrocław, Poland
Abstract
A word-to-word similarity function automatically extracted from a corpus of texts can be a very helpful tool in automatic extraction of lexical semantic relations. There are many approaches for English, but only a few for inflective languages with almost free word order. In the paper a method for the construction of a similarity function for Polish nouns is proposed. The method uses only simple tools for language processing (e.g. it does need the application of a parser). The core is the construction of a matrix of co-occurrences of nouns and adjectives on the basis of application of morpho-syntactic constraints testing agreement between an adjective and a noun. Several methods of transformation of the matrix and calculation of the similarity function are presented. The achieved accuracy of 81.15% in WordNet-based Synonymy Test (for 4 611 Polish nouns, using the current version of Polish WordNet) seems to be comparable with the best results reported for English (e.g. 75.8% [5]).

Keywords  semantic similarity function - Polish - automatic extraction - nouns - LSA


Contact Information Maciej Piasecki
Email: maciej.piasecki@pwr.wroc.pl
Fulltext Preview (Small, Large)
Image of the first page of the fulltext

References secured to subscribers.



Export this chapter
Export this chapter as RIS | Text
 
Remote Address: 38.107.191.114 • Server: mpweb21
HTTP User Agent: CCBot/1.0 (+http://www.commoncrawl.org/bot.html)