A free/open-source machine translation platform

Technologies
python, c++, xml, bash
Topics
natural language processing, machine translation, less-resourced languages

Apertium is a shallow-transfer machine translation system, which uses finite state transducers for all of its lexical transformations, and hidden Markov models and/or constraint grammars for part-of-speech tagging or word category disambiguation.

Most machine translation systems available at present are commercial or use proprietary technologies, which makes them very hard to adapt to new uses. Furthermore, they use different technologies across language pairs, which makes it very difficult, for instance, to integrate them into a single multilingual content management system. Finally, most of them are unavailable for most of the world's languages, because they rely heavily on resources that those languages lack.

Apertium uses a language-independent specification, which makes it easier to contribute to the project, makes development more efficient, and supports the project's overall growth.

At present, Apertium has released more than 40 stable language pairs, delivering fast translation with reasonably intelligible results. Being an open-source project, Apertium provides tools for potential developers to build their own language pair and contribute to the project.

2019 Program

Successful Projects

Contributor
Daniel Swanson
Mentor
Mikel L. Forcada, Jonathan W, Francis Tyers
Organization
Apertium
Recursive Transfer
Build a GLR parser-generator as an alternative to the current chunking system to better support long-distance phrasal reordering.
Contributor
Tanmai Khanna
Mentor
Kevin Brubeck Unhammer, Francis Tyers
Organization
Apertium
Anaphora Resolution
Anaphora resolution is the problem of resolving references to earlier items in the discourse. This most commonly appears as pronoun resolution where...
Contributor
Sharapat Kalabaev
Mentor
Ilnar Salimzianov, Jonathan W
Organization
Apertium
Develop a releasable Uzbek-Qaraqalpaq translation pair
In this project I am going to create a new translation pair between Uzbek and Qaraqalpaq. There is no other single translator between these two...
Contributor
Aboelhamd Aly
Mentor
sevilay bayatli, Francis Tyers
Organization
Apertium
Improve/Extend weighted transfer rules module
Ambiguous patterns are those to which more than one transfer rule could apply. Apertium resolves this ambiguity by applying the left-to-right...
Contributor
Oğuz
Mentor
Memduh Gokirmak, Sardana Ivanova, Ilnar Salimzianov, Jonathan W
Organization
Apertium
Turkic MT improvements
Refining four Turkic MTs: uig-tur, kyr-tur, uzb-tur and tat-tur
Contributor
vaydheesh
Mentor
Sushain Cherivirala
Organization
Apertium
Python API/library for Apertium
Apertium is a free/open-source rule-based machine translation platform implemented in C++. Right now, the project is calling Apertium binaries as...
Contributor
Eden-Grace Muamba
Mentor
Mikel L. Forcada, Anastasia Kuznetsova, Jonathan W
Organization
Apertium
English-Lingala language pair
An ‘English-Lingala’ language pair using the Apertium rule-based machine translation system.
Contributor
Amr Keleg
Mentor
Nick Howell, Flammie, Francis Tyers
Organization
Apertium
Unsupervised weighting of automata
Finite-state automata/transducers are currently used in many applications, including machine translation. One of the most challenging parts of...
Contributor
Alyaxey Yaskevich
Mentor
Jonathan W, Francis Tyers, maryszmary
Organization
Apertium
Improvement of Annotatrix project
Bug fixes and feature implementations for the Annotatrix tool
Contributor
Hèctor Alòs Font
Mentor
Marc Riera, Francis Tyers, Xavi Ivars
Organization
Apertium
Improving the Catalan-Italian and Catalan-Portuguese language pairs
In this project there are two major goals: 1) improving the existing translators from Italian to Catalan, from Portuguese to Catalan and from Catalan...