Building morphological analyzer for Nepali

Authors

  • Shahid Mushtaq Bhat, Linguistic Data Consortium for Indian Languages
  • Rupesh Rai, Central Institute of Indian Languages

Keywords:

Morphological analyzer, Word and paradigm model, Apertium, LT-Toolbox, Paradigm, Concatenative morphology, Machine translation, Devanagari, Transliteration

Abstract

A morphological analyzer is a fundamental tool in Natural Language Processing (NLP) that generates the morphological analyses of a given word form. It can be used to improve the accuracy of POS tagging, chunking, syntactic parsing, Word Sense Disambiguation (WSD), Information Retrieval (IR), and Machine Translation (MT) systems. This paper describes an ongoing effort to develop a Nepali morphological analyzer using the open-source platform Apertium (LT-Toolbox). Since this is the initial stage of the project, we have confined our work to inflectional morphology. So far, we have covered all categories in the LDC-IL POS tag set for Nepali. The analyzer currently covers 20,000 words, classified into 219 paradigms.
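
The word-and-paradigm idea named in the keywords can be illustrated with a minimal Python sketch. This is not the authors' implementation: the actual system is built with Apertium's dictionary tooling, whereas the code below only mimics the lookup logic. The paradigm name, the two-entry lexicon, the transliterated forms, and the tag strings are all assumptions chosen for illustration and are not taken from the paper's data.

```python
from typing import Dict, List, Tuple

# Hypothetical toy paradigm: each entry pairs a suffix with the tags it signals.
# The paradigm name and tag strings are illustrative, not the paper's tag set.
PARADIGMS: Dict[str, List[Tuple[str, str]]] = {
    "noun_regular": [
        ("", "<n><sg>"),      # bare stem read as singular
        ("haru", "<n><pl>"),  # transliterated plural marker
    ],
}

# Hypothetical toy lexicon: each stem points to the paradigm it inflects by.
LEXICON: Dict[str, str] = {
    "keta": "noun_regular",   # 'boy' (transliterated)
    "kitab": "noun_regular",  # 'book' (transliterated)
}


def analyze(word_form: str) -> List[str]:
    """Return every stem+tags analysis licensed by the lexicon and paradigms."""
    analyses: List[str] = []
    # Try each split of the word form into a candidate stem and suffix,
    # starting with the longest stem.
    for cut in range(len(word_form), 0, -1):
        stem, suffix = word_form[:cut], word_form[cut:]
        paradigm = LEXICON.get(stem)
        if paradigm is None:
            continue
        for paradigm_suffix, tags in PARADIGMS[paradigm]:
            if paradigm_suffix == suffix:
                analyses.append(stem + tags)
    return analyses


if __name__ == "__main__":
    for form in ("keta", "ketaharu", "kitabharu", "xyz"):
        print(form, analyze(form))
```

Because every stem that shares an inflection pattern points to the same paradigm, adding a new word to such a scheme means adding one lexicon entry rather than listing each inflected form, which is how a large lexicon can be grouped under a comparatively small set of paradigms.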

Published

2012-12-31

How to Cite

Bhat, S. M., & Rai, R. (2012). Building morphological analyzer for Nepali. Journal of Modern Languages, 22(1), 45–58. Retrieved from http://mjs.um.edu.my/index.php/JML/article/view/3297