A free/open-source platform for machine translation and language technology

Technologies
python, c++, xml, fsts
Topics
natural language processing, machine translation, language technology, grammar

There are around 7,000 languages in the world today, around half of which are written. Most language technology is available for only a tiny fraction of these, certainly under 1%. Apertium is a project that aims to help create language technology, particularly machine translation systems, for the other 99%. Because most of the languages we work with have very few existing translations, we rely on making the most of all kinds of resources, from written grammars and dictionaries to corpus collections and help from native speakers and activists.

2018 Program

Successful Projects

Contributor
Claude Balaguer
Mentor
Hèctor Alòs i Font
Organization
Apertium
Fra-oci/oci-fra translator
I intend to work on a French-Occitan translation pair in order to provide a new translator, which will be useful first to the Occitan community but...
Contributor
Anastasia Kuznetsova
Mentor
Francis Tyers
Organization
Apertium
Adoption of Guarani - Spanish pair
Guarani is one of the most widely spread indigenous languages of southern South America. It is spoken by 6 million people in Paraguay (where it is...
Contributor
Vidyadheesha D N
Mentor
Shardul Chiplunkar, Vinit Ravishankar
Organization
Apertium
Kannada-Marathi language translation
I am adding a new language pair (Kannada-Marathi) to Apertium.
Contributor
Sardana Ivanova
Mentor
Ilnar Salimzianov, Jonathan W
Organization
Apertium
Apertium translation pair for Kazakh and Sakha
I would like to develop an Apertium translation pair for the Kazakh and Sakha languages. It would benefit society as a whole by keeping diversity supporting...
Contributor
Evgenii Glazunov
Mentor
Mikel Forcada, Kevin Brubeck Unhammer, Francis Tyers
Organization
Apertium
Bilingual dictionary enrichment via graph completion
Graph representation is very promising because it represents a philosophical model of metalanguage knowledge. Knowing several languages, I know...
Contributor
Marc Riera Irigoyen
Mentor
Hèctor Alòs i Font, Xavi Ivars
Organization
Apertium
Adopting the unreleased Romanian-Catalan pair and upgrading other pairs to the monolingual module system
Currently there are no machine translation systems offering direct translation between Romanian and Catalan available to the general public. English...
Contributor
Oğuz
Mentor
Memduh Gökırmak, Gianluca Grossi, Ilnar Salimzianov
Organization
Apertium
Uyghur-Turkish MT
An MT system for the closely related Turkic languages Uyghur, of the Karluk branch, and Turkish, of the Oghuz branch.
Contributor
Anna Kondratjeva
Mentor
Mikel Forcada, Francis Tyers
Organization
Apertium
Improving language pairs by mining MediaWiki Content Translation postedits
The purpose of this proposal is to create a toolbox for automatic improvement of the lexical component of a language pair. This toolbox might become a...
Contributor
Anna Zueva
Mentor
Ilnar Salimzianov, Jonathan W, Francis Tyers
Organization
Apertium
Tatar and Bashkir: developing a language pair
The tat-bak language pair already exists in Apertium, but is now in the nursery state. The aim of my project is to develop this language pair, fill...
Contributor
Abinash Senapati
Mentor
Tommi Pirinen
Organization
Apertium
Extend lttoolbox to have the power of HFST
The aim of this project is to implement support for morphographemics and weights in the lttoolbox transducer. The proposal focuses on extending...
Contributor
kmurphy4
Mentor
Francis Tyers, Jonathan W, maryszmary
Organization
Apertium
UD-Annotatrix
This project aims to extend the functionality of the UD Annotatrix tool. This tool allows researchers to annotate universal dependency trees right...