Natural Language Generation (NLG) is the process of generating coherent natural language text from non-linguistic data. While the community has largely converged on speech and text as the outputs of these models, the inputs vary far more widely: NLG systems have been built over images, numeric data, semantic representations, and Semantic Web (SW) data. Recently, generating natural language from SW data, more precisely RDF data, has gained substantial attention and has also proved useful for creating NLG benchmarks. However, most models aim at generating coherent sentences in English, while other languages have received comparatively little attention from researchers. RDF data usually takes the form of (subject, predicate, object) triples: the subject denotes a resource, and the predicate denotes a trait or aspect of that resource and expresses the relationship between the subject and the object.
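As a minimal illustration of the task, the sketch below represents RDF triples as (subject, predicate, object) tuples and verbalizes them with a naive template baseline. The example triples and camel-case predicate names are invented for demonstration; in the project itself, a trained neural model would replace the template step.

```python
# Illustrative sketch only: RDF triples as (subject, predicate, object)
# tuples, verbalized by a trivial template baseline. The entities and
# predicates below are hypothetical examples, not project data.

import re

def split_camel_case(name):
    """Turn a camelCase predicate such as 'birthPlace' into 'birth place'."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name).lower()

def verbalize(triples):
    """Render each (subject, predicate, object) triple as a flat clause."""
    clauses = [
        f"{s.replace('_', ' ')} {split_camel_case(p)} {o.replace('_', ' ')}"
        for s, p, o in triples
    ]
    return ". ".join(clauses) + "."

triples = [
    ("Alan_Turing", "birthPlace", "London"),
    ("Alan_Turing", "field", "Computer_Science"),
]
print(verbalize(triples))
# → Alan Turing birth place London. Alan Turing field Computer Science.
```

The gap between this rigid template output and fluent, grammatical prose in multiple languages is exactly what a neural verbalizer is meant to close.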

In this project we aim to create a multilingual neural verbalizer, i.e., a single stand-alone, end-to-end trainable model that generates high-quality natural-language text from sets of RDF triples in multiple languages.



Dwaraknath Gnaneshwar


  • Diego Moussallem
  • Thiago Castro Ferreira