Anaphora resolution is the task of identifying what an expression refers back to earlier in the discourse. It most commonly takes the form of pronoun resolution, where the antecedent of a pronoun must be identified in the source context. Apertium works with resource-poor languages, for which linguistic information as rich as parse trees is not available. Hence there is a need for a tool that resolves anaphora using only simple linguistic information.

Instead of the current system, which defaults to the masculine form when resolving pronouns, this tool will use linguistic features to assign saliency scores to the possible antecedents. The highest-scoring antecedent is chosen for possessive, reflexive, and zero pronouns, and for long-distance relations such as adjective agreement. The formalism is language-agnostic, and the features rely only on POS tags and basic gender and number information. I will test it on several languages, including Spanish, Catalan, English, Russian, and French.
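
The scoring idea described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the Token class, the tag names, the feature weights, and the helper functions salience, resolve, and target_possessive are all hypothetical, chosen only to show how POS tags plus gender and number could rank candidate antecedents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    surface: str
    pos: str      # hypothetical tags: "n" noun, "np" proper noun, "prn", "det", ...
    gender: str   # "m", "f", or "" when unknown or ambiguous
    number: str   # "sg", "pl", or "" when unknown or ambiguous


def salience(candidate: Token, pronoun: Token, distance: int) -> float:
    """Score one candidate antecedent. All weights are illustrative assumptions."""
    score = 0.0
    if candidate.pos == "np":
        score += 1.0  # assumed bonus for proper nouns
    for attr in ("gender", "number"):
        c, p = getattr(candidate, attr), getattr(pronoun, attr)
        if c and p:  # compare only when both values are known
            score += 2.0 if c == p else -2.0
    score -= 0.5 * distance  # assumed recency penalty, measured in tokens
    return score


def resolve(tokens: List[Token], pronoun_index: int) -> Optional[Token]:
    """Return the highest-scoring noun before the pronoun, or None if there is none."""
    pronoun = tokens[pronoun_index]
    candidates = [
        (i, t) for i, t in enumerate(tokens[:pronoun_index]) if t.pos in ("n", "np")
    ]
    if not candidates:
        return None
    _, best_token = max(
        candidates, key=lambda it: salience(it[1], pronoun, pronoun_index - it[0])
    )
    return best_token


def target_possessive(antecedent: Optional[Token]) -> str:
    """Pick an English possessive from the antecedent's gender and number."""
    if antecedent is None:
        return "his"  # mimics the default-masculine fallback mentioned above
    if antecedent.number == "pl":
        return "their"
    return {"m": "his", "f": "her"}.get(antecedent.gender, "its")


# Spanish "la chica abrió su libro": "su" carries no gender for the possessor.
sentence = [
    Token("chica", "n", "f", "sg"),
    Token("abrió", "vblex", "", ""),
    Token("su", "det", "", ""),
    Token("libro", "n", "m", "sg"),
]
antecedent = resolve(sentence, 2)
possessive = target_possessive(antecedent)  # "her" rather than the default "his"
```

In this sketch the source pronoun contributes agreement evidence only when its own gender or number is known, so an ambiguous form like Spanish "su" is resolved by the remaining features and the antecedent then determines the target-language pronoun.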

When implemented, this tool will improve the fluency and intelligibility of Apertium's translation output for any language pair it is used with. It also opens up several future directions, such as language-specific linguistic features and general coreference resolution.



Tanmai Khanna


  • Kevin Brubeck Unhammer
  • Francis Tyers