Finite-state automata and transducers are used in many applications, including machine translation. One of the most challenging parts of developing transducer-based models is deciding how to weight the edges so that one path is favored over another.

The obvious technique is to build a manually annotated corpus, use it to estimate probabilities, and then apply these estimates to the edges as weights. However, building a large annotated corpus is in most cases tedious work. The project therefore aims to generate these weights from raw corpora alone, using unsupervised techniques.
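
As a rough illustration of the supervised baseline described above, the sketch below turns counts from a tiny, made-up annotated corpus into weights. It assumes weights are negative log probabilities (so a lower weight means a more likely path) and uses add-one smoothing to reserve some probability mass for unseen analyses. The function name `estimate_weights`, the tag strings, and the smoothing scheme are illustrative assumptions, not part of the project itself.

```python
import math
from collections import Counter


def estimate_weights(annotated_analyses, smoothing=1.0):
    """Estimate a weight -log(P(analysis)) for each analysis in the corpus.

    Hypothetical helper: additive smoothing reserves one extra slot of
    probability mass for any analysis that never occurs in the corpus.
    """
    counts = Counter(annotated_analyses)
    # Total mass = observed counts + smoothing for each seen type + one unseen slot.
    total = sum(counts.values()) + smoothing * (len(counts) + 1)
    weights = {
        analysis: -math.log((count + smoothing) / total)
        for analysis, count in counts.items()
    }
    # Weight given to any analysis that was not observed in the corpus.
    default_weight = -math.log(smoothing / total)
    return weights, default_weight


# Toy "manually annotated" corpus: one disambiguated analysis per token.
corpus = ["cat<n><sg>", "cat<n><sg>", "run<vblex><inf>", "run<n><sg>"]
weights, default_weight = estimate_weights(corpus)

# Print analyses from most to least favored (lowest weight first).
for analysis, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{analysis}\t{weight:.4f}")
```

Each resulting value could then be attached to the corresponding path in a weighted transducer. The unsupervised methods targeted by the project would need to produce comparable weights without the annotated list used here.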



Amr Keleg


  • Nick Howell
  • Flammie
  • Francis Tyers