Lemmatization and stemming in the automatic morphological analysis of Uzbek-language texts

Manzura Abjalova

Keywords:

Uzbek language, morphological analysis, natural language processing, NLP, lemmatization, stemming, Information Retrieval techniques, sound changes, lemma, stem, normal form, dictionary form, basis, rule-based method, dictionary method, dictionary-free method, stochastic method

Abstract

In natural language processing (NLP), the stages of graphematic
analysis (tokenization), morphological analysis (lemmatization
and stemming), syntactic analysis (parsing), and semantic analysis are
important in almost every application area. Once natural language has
been adapted for digital processing, many software applications can be
built on it. Lemmatization and stemming are morphological-analysis
techniques common to all languages; they determine the normal form of
a word form, that is, its dictionary form. Although the two share this
task, they differ in their output: stemming is valued for its speed,
whereas lemmatization is valued for its linguistically precise result.
Although lemmatization was originally intended for inflectional
languages, it is now also used for agglutinative languages. Both
techniques are important in processing the Uzbek language. This article
describes the similarities and differences between lemmatization and
stemming, the use of both techniques for the Uzbek language, and how
the term "morphological analysis" differs between NLP and Uzbek
linguistics.
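
The difference in output between the two techniques can be illustrated with a minimal Python sketch. It is not the article's implementation: the suffix list, the stem-to-lemma table, and the function names `stem` and `lemmatize` are hypothetical and deliberately tiny. The stemmer only strips suffixes, so a word form affected by a sound change (for example, shahar + i → shahri) is reduced to a truncated stem, while the lemmatizer adds a dictionary lookup that restores the dictionary form.

```python
# Hypothetical, deliberately tiny resources for illustration only.
# A real Uzbek analyzer would need a far larger suffix inventory and lexicon.
SUFFIXES = sorted(["lar", "dan", "ning", "ni", "i"], key=len, reverse=True)

# Maps stems damaged by sound changes back to their dictionary form (assumed table).
STEM_TO_LEMMA = {
    "shahr": "shahar",  # vowel deletion: shahar + i -> shahri
    "tilag": "tilak",   # consonant voicing: tilak + i -> tilagi
}


def stem(word: str, min_stem_len: int = 3) -> str:
    """Dictionary-free stemming: repeatedly strip a known suffix from the right."""
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
                word = word[: -len(suffix)]
                changed = True
                break
    return word


def lemmatize(word: str) -> str:
    """Stem first, then use the lookup table to restore the dictionary form."""
    candidate = stem(word)
    return STEM_TO_LEMMA.get(candidate, candidate)


if __name__ == "__main__":
    for form in ["kitoblarni", "shahridan", "tilagi"]:
        print(form, "stem:", stem(form), "lemma:", lemmatize(form))
```

For a regular form such as kitoblarni the two functions agree (kitob); they diverge only where a sound change alters the base, which is where the extra cost of a dictionary lookup pays off.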
