Abstract

The paper describes a Sanskrit morphological analyzer that identifies and analyzes inflected noun forms and verb forms in any given sandhi-free text. The system, developed as a Java servlet application with an RDBMS back end, can be tested at http://sanskrit.jnu.ac.in (Language Processing Tools > Sanskrit Tinanta Analyzer/Subanta Analyzer) with Sanskrit data in Unicode text. Currently, the system checks each word and labels it with one of three basic POS categories: subanta, tiṅanta, or avyaya. Thereafter, each subanta is sent for subanta processing based on an example database and a rule database. The verbs are examined using a database of verb roots and forms, as well as by reverse morphology based on Pāṇinian techniques. Subsequently, the separate subanta and tiṅanta systems will be combined into a single system of sentence analysis with kāraka interpretation. Future enhancements include plugging the amarakośa (http://sanskrit.jnu.ac.in/amara) and other noun lexicons into the subanta system. The tiṅanta system will be enhanced by the kṛdanta analysis module being developed separately.
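
The labelling and subanta steps described above can be pictured with a minimal Java sketch. It is an illustration under stated assumptions, not the system's actual code: the class name `MorphSketch`, the in-memory tables standing in for the RDBMS, the toy entries, and the choice to treat any word that is neither a known avyaya nor a known verb form as a subanta are all hypothetical. Forms are given in ASCII romanization for portability, whereas the real system takes Unicode input.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: in-memory tables stand in for the RDBMS tables.
class MorphSketch {
    enum Pos { AVYAYA, TINANTA, SUBANTA }

    // Toy data (assumed for illustration only).
    static final Set<String> AVYAYAS = Set.of("ca", "eva", "api");
    // Verb form -> root, standing in for the verb root/form database.
    static final Map<String, String> VERB_FORMS = Map.of("gacchati", "gam");
    // Example database: irregular forms stored whole.
    static final Map<String, String> SUBANTA_EXAMPLES =
            Map.of("aham", "asmad, nominative singular");
    // Rule database: {suffix, stem ending to restore, analysis}.
    static final String[][] SUBANTA_RULES = {
            {"ena", "a", "instrumental singular"}
    };

    // Assumption: anything not found as avyaya or verb form is a subanta.
    static Pos label(String word) {
        if (AVYAYAS.contains(word)) return Pos.AVYAYA;
        if (VERB_FORMS.containsKey(word)) return Pos.TINANTA;
        return Pos.SUBANTA;
    }

    // Try the example database first, then fall back to suffix rules.
    static String analyzeSubanta(String word) {
        String stored = SUBANTA_EXAMPLES.get(word);
        if (stored != null) return stored;
        for (String[] rule : SUBANTA_RULES) {
            if (word.endsWith(rule[0])) {
                String stem = word.substring(0, word.length() - rule[0].length()) + rule[1];
                return stem + ", " + rule[2];
            }
        }
        return "unanalyzed";
    }

    public static void main(String[] args) {
        for (String w : "aham devena gacchati ca".split("\\s+")) {
            Pos pos = label(w);
            String detail = (pos == Pos.SUBANTA) ? analyzeSubanta(w)
                    : (pos == Pos.TINANTA) ? "root " + VERB_FORMS.get(w) : "";
            System.out.println(w + " -> " + pos + " " + detail);
        }
    }
}
```

Checking the example table before the rule table lets irregular forms such as pronouns bypass suffix stripping; whether the actual system orders the two lookups this way is not stated in the abstract.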

Keywords  morphology - analyzer - subanta - tiṅanta - kṛdanta - taddhita - strīpratyaya - samāsa - avyaya - kāraka - vibhakti - vacana - sandhi - pada - prātipadika - pratyaya - sup - tiṅ - Pāṇini - sūtra - Aṣṭādhyāyī - POS - dhātu - dhātupāṭha - gaṇa - gaṇapāṭha - lakāra - dhāturūpa - śabdarūpa - java - JSP - servlet - Apache-Tomcat - RDBMS - SQL server - JDBC - Unicode