A free/open-source machine translation platform

Apertium is a shallow-transfer machine translation system, which uses finite state transducers for all of its lexical transformations, and hidden Markov models and/or constraint grammars for part-of-speech tagging or word category disambiguation.
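
The stages named above can be illustrated with a deliberately tiny sketch. This is not Apertium's code or data: the dictionaries, tag names, and function names below are all hypothetical, plain Python dicts stand in for finite-state transducers, and a single hand-written rule stands in for HMM or constraint-grammar disambiguation. It only shows the order of the steps: analyse each word, pick one reading, then map lemmas through a bilingual lexicon.

```python
# Toy sketch of a shallow-transfer pipeline (English -> Spanish lemmas).
# Everything here is hypothetical illustration, not Apertium's implementation.

# Morphological analysis: surface form -> possible (lemma, tags) readings.
# A real system would use a finite-state transducer instead of a dict.
ANALYSES = {
    "the": [("the", ("det",))],
    "cats": [("cat", ("n", "pl"))],
    "sleep": [("sleep", ("n", "sg")), ("sleep", ("vblex", "pres"))],
}

# Bilingual lexicon: (source lemma, part of speech) -> target lemma.
BILINGUAL = {
    ("the", "det"): "el",
    ("cat", "n"): "gato",
    ("sleep", "n"): "sueño",
    ("sleep", "vblex"): "dormir",
}


def analyse(tokens):
    """Return every known reading for each token; unknown words get a dummy tag."""
    return [ANALYSES.get(tok, [(tok, ("unknown",))]) for tok in tokens]


def disambiguate(readings_per_token):
    """Pick one reading per token.

    Toy rule standing in for a statistical or constraint-based tagger:
    directly after a noun, prefer a verb reading; otherwise take the first.
    """
    chosen = []
    for readings in readings_per_token:
        pick = readings[0]
        if chosen and chosen[-1][1][0] == "n":
            for reading in readings:
                if reading[1][0] == "vblex":
                    pick = reading
                    break
        chosen.append(pick)
    return chosen


def transfer(units):
    """Replace each source lemma with its target lemma, keeping the tags."""
    return [(BILINGUAL.get((lemma, tags[0]), lemma), tags) for lemma, tags in units]


def translate_units(sentence):
    """Run analysis, disambiguation, and lexical transfer on one sentence."""
    return transfer(disambiguate(analyse(sentence.lower().split())))


if __name__ == "__main__":
    print(translate_units("The cats sleep"))
```

The sketch stops at lexical transfer. A full pipeline would also apply structural transfer rules (for example, agreement between determiner and noun) and generate target-language surface forms, both of which are omitted here.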

Most machine translation systems available today are commercial and rely on proprietary technologies, which makes them very hard to adapt to new uses. They also use different technologies across language pairs, which makes it very difficult, for instance, to integrate them into a single multilingual content management system. Finally, most of them are unavailable for the majority of the world's languages, because they rely heavily on resources that exist for only a few languages.

Apertium uses a language-independent specification, which makes the project easier to contribute to, makes development more efficient, and supports its overall growth.

At present, Apertium has released more than 40 stable language pairs, delivering fast translation with reasonably intelligible or excellent results depending on the language pair. Being an open-source project, Apertium provides tools for potential developers to build their own language pair and contribute to the project.

Technologies

  • c++
  • bash
  • python
  • xml

Topics

  • Other
  • machine translation
  • natural language processing
  • less-resourced languages

Apertium 2020 Projects

  • Priyank Modi
    Adopt an unreleased language pair : Hindi-Punjabi
    I plan on developing the Hindi-Punjabi language pair in both directions i.e. hin-pan and pan-hin. This'll involve improving the monolingual...
  • Amirniyaz Mambetniyazov
    Adopting an unreleased language pair of Uzb-> Kaa
    In this project I am going to create a new language pair, uzb-kaa. Last year I helped a GSoC 2019 student with translations, as I am native...
  • Hèctor Alòs Font
    Adopting the French-Arpitan language pair
    I propose to create a bidirectional French-Arpitan translator. Arpitan (often called Franco-Provençal) is an endangered and heavily under-resourced...
  • Shashwat Goel
    Bilingual Dictionary Discovery via Graph Exploration
    A crucial step in developing a language pair is writing its bilingual dictionary, which maps a lemma X in language A to a lemma Y in language B if X...
  • Khalid Alnajjar
    Extending Ve’rdd for Apertium Needs
    This proposal is targeted to the task named "A Web Interface to expanding dictionary lemmas integrate with GitLab/GitHub". As I have already...
  • Tanmai Khanna
    Modifying the apertium stream format and solving the markup reordering problem using wordbound blanks
    Markup handling has been a problem in Apertium for a long time. It was done using superblanks that encapsulate markup information inside them during...
  • Elmurod Kuriyozov
    State-of-the-art Morphological Analyser for Uzbek language and improved language pairs uz-kk, uz-ky, uz-tr.
    Creating the state-of-the-art HFST-based Morphological Analyser for the Uzbek language, contributing to the Karakalpak and Uyghur Morphological...

2020