Contributor
Evgenii Glazunov

Bilingual dictionary enrichment via graph completion


Mentors
Mikel Forcada, Kevin Brubeck Unhammer, Francis Tyers
Organization
Apertium

Graph representation is very promising because it offers a philosophical model of metalinguistic knowledge. As a speaker of several languages, I know that a rare word can be hard to recall: sometimes it is easier to translate from French to English and only then to Russian, because I have forgotten the direct Russian-French word pair. The graph representation works just like my memory. We cannot recall what a word from L1 is in L2, but we do know the L1-L3 and L3-L2 pairs. That is the link we need, and now we know the L1-L2 word pair. Since we work on natural language processing, let's use natural instruments and systems as well.
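
The recall process described above can be sketched as a search over a small graph. This is only a minimal illustration, not the project's implementation: the `(language, word)` node format, the sample entries, and the helper names `build_graph` and `pivot_candidates` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample data: each edge stands for one bilingual dictionary entry.
EDGES = [
    (("fra", "chat"), ("eng", "cat")),
    (("eng", "cat"), ("rus", "кошка")),
    (("fra", "chien"), ("eng", "dog")),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def pivot_candidates(graph, source, target_lang):
    """Return target-language words reachable from `source` through exactly
    one intermediate (pivot) word and not already linked to it directly."""
    direct = graph.get(source, set())
    candidates = set()
    for pivot in direct:
        for node in graph.get(pivot, set()):
            if node[0] == target_lang and node != source and node not in direct:
                candidates.add(node[1])
    return sorted(candidates)


graph = build_graph(EDGES)
# French "chat" has no direct Russian entry, but English "cat" links the two.
print(pivot_candidates(graph, ("fra", "chat"), "rus"))
```

In this toy data the French-Russian pair is missing, and the English entry plays the role of L3. Every candidate found this way is only a suggestion that still needs evaluation.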

The main benefit of this project is reducing human labor by automating part of dictionary development:

  1. Finding lacunae (missing entries) in existing dictionaries.
  2. Dictionary enrichment based on an algorithm that proposes candidate translations and evaluates them.
  3. A potential base for creating new language pairs.

List of main ideas:

  1. Classes that store dictionary information in the most appropriate form
  2. Working with subgraphs (connected components) to reduce the complexity of calculations
  3. Filtering algorithms
  4. Vectorization to increase efficiency
  5. Developing different metrics to assess translation quality
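
Two of the ideas above, splitting the graph into connected components and scoring candidates with a metric, can be illustrated together. The sketch below rests on several assumptions: the shared-pivot count is a stand-in metric chosen for illustration rather than one defined by the project, and all function names and sample entries are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical sample data: nodes are (language, word), edges are dictionary entries.
EDGES = [
    (("fra", "chat"), ("eng", "cat")),
    (("fra", "chat"), ("spa", "gato")),
    (("eng", "cat"), ("rus", "кошка")),
    (("spa", "gato"), ("rus", "кошка")),
    (("fra", "chien"), ("eng", "dog")),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Split the graph into connected components using breadth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def score_candidates(graph, component, lang_a, lang_b):
    """Score missing lang_a-lang_b pairs inside one component by the number
    of pivot words the two nodes share (an illustrative metric only)."""
    scores = {}
    for a in component:
        if a[0] != lang_a:
            continue
        for b in component:
            if b[0] != lang_b or b in graph[a]:
                continue
            shared = len(graph[a] & graph[b])
            if shared:
                scores[(a[1], b[1])] = shared
    return scores


graph = build_graph(EDGES)
components = connected_components(graph)
scores = {}
for component in components:
    # Candidates are only searched for inside a component, never across components.
    scores.update(score_candidates(graph, component, "fra", "rus"))
print(len(components), scores)
```

Because words in different components can never be linked by any path, restricting the search to one component at a time avoids comparing unrelated words. A pair supported by more pivots, as with the English and Spanish entries here, could then be ranked above a pair supported by only one.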