Free/open-source rule-based machine translation

Technologies
python, c++, xml, nlp
Topics
natural language processing, machine translation, lesser-resourced languages

Apertium is a free/open-source machine translation platform. It was initially aimed at related-language pairs (such as Spanish–Catalan) but has since been expanded to handle more divergent language pairs (such as English–Catalan and even Basque→English). The platform provides

  • a language-independent machine translation engine,
  • tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and
  • linguistic data for a growing number of languages and language pairs.
2017 Program

Successful Projects

Contributor
maryszmary
Mentor
Filip Ginter, Jonathan W
Organization
Apertium
UD-annotatrix
The aim of my project is to create an easy-to-use, quick and interactive interface tool for UD annotation based on the existing Apertium project. The...
Contributor
Anna Kondratjeva
Mentor
Mikel L. Forcada, Francis Tyers
Organization
Apertium
Implementing a shallow syntactic function labeller
In many pairs it is useful to know, in addition to the morphological tags of a word, its syntactic function tags in order to make an adequate translation....
Contributor
Vinay Singh
Mentor
Mikel L. Forcada, Kevin Brubeck Unhammer, Tino Didriksen
Organization
Apertium
Automatic blank handling
Our current handling of formatting/markup (HTML, odt, docx, latex) is brittle, requiring transfer rules to explicitly deal with blanks (e.g. markup),...
Contributor
lylax47
Mentor
Robert Reynolds
Organization
Apertium
Development of the Czech to Russian Language Pair
I plan to assist in the development of the Czech to Russian language pair in order to bring the Czech to Russian translation capabilities to release...
Contributor
mono
Mentor
Sushain Cherivirala, Xavier Ivars Ribes, Jonathan W
Organization
Apertium
Improvements to the Apertium Website Interface
Apertium is a free/open-source platform for rule-based machine translation and language technology which is aimed at providing support for...
Contributor
Memduh Gökırmak
Mentor
Ilnar Salimzianov, Jonathan W, Francis Tyers
Organization
Apertium
Crimean Tatar-Turkish MT
Creating a new rule-based translation pair between Crimean Tatar and Turkish. This involves disambiguation, transfer and lexical selection.
Contributor
Irene Tang
Mentor
Mikel L. Forcada, Jonathan W
Organization
Apertium
Discontiguous Multiwords
Discontiguous multiwords are multi-word expressions that are separated by something in the middle (e.g. "take the garbage out"). Apertium currently...
Contributor
Vasilisa Andriyanets
Mentor
Michael Dunn, Maria Pupynina, Francis Tyers
Organization
Apertium
Chukchi morphological analyser using HFST
Chukchi is a language with rich and complicated morphology and incorporation. Until now, morphological parsers using regular expressions have not been able to...
Contributor
Gianfranco Fronteddu
Mentor
Hèctor Alòs i Font, Adrià Martín-Mor, Xavier Ivars Ribes
Organization
Apertium
“Proposal apertium cat-srd and ita-srd”
Catalan to Sardinian (apertium-cat-srd): The project has already started and currently the bidix is in the Staging section. The Catalan language is...
Contributor
Marc Riera Irigoyen
Mentor
Mikel L. Forcada, Adrià Martín-Mor, Xavier Ivars Ribes
Organization
Apertium
Adopting English-Catalan language pair to bring it close to state-of-the-art quality
Apertium currently has an English-Catalan language pair in trunk, but there is a lot of room for improvement. One of the existing problems is the use...