Apertium currently has an English-Catalan language pair in trunk, but there is a lot of room for improvement. One of the existing problems is that it uses monodixes that are not shared with other languages, which makes future development more difficult. The main goal of this proposal is to bring the unreleased eng-cat language pair (which uses shared monodixes) to the same level as the released en-ca language pair so that the old pair can be retired. In addition, coverage will be increased and error rates reduced by expanding the bidix and adding new transfer and lexical selection rules. The proposal focuses on EN>CA machine translation, but some work will also be done to improve translation in the opposite direction.

Organization

Student

Marc Riera Irigoyen

Mentors

  • Mikel L. Forcada
  • Adrià Martín-Mor
  • Xavier Ivars Ribes

2017