Contributor
Kamush

Implementing a new language pair: Kazakh - Uzbek


Mentors
Sevilay Bayatli, Jonathan W
Organization
Apertium

Having seen the benefits of Apertium, an open-source rule-based machine translation platform, as an alternative to other free and commercial online translation systems, especially for low-resource language pairs, I decided to contribute to the platform by extending the list of language pairs available for my native language, Uzbek. As a master's student in philology with some experience in creating language resources, I propose to implement a new language pair for Apertium: Kazakh - Uzbek. Both are low-resource Turkic languages, and each is the official language of a Central Asian country; the two countries share many economic and cultural ties. Yet this language pair still lacks an open-source machine translation system. My proposal is to fill this gap as far as possible during GSoC 2021. Because Uzbek and Kazakh belong to the same language family, they are closely related in grammar, word order, and vocabulary, so I will aim for bidirectional translation, with more focus on the Kazakh -> Uzbek direction, since Uzbek is my native language and I have only very basic knowledge of Kazakh.