The aim of this project is to enhance the DBpedia Knowledge Base by training a model on the corpus to generate embeddings for its entities, such as classes, instances and properties. These embeddings must capture the semantic relatedness between entities. We therefore do not limit ourselves to the similarity between words; we go a step further and also define the relatedness between the vectors, and thus the relation between the entities and the text. To incorporate this notion of semantic distance, we define a measure of the descriptiveness of the class that the entities belong to: entities belonging to a highly descriptive class must have a very low semantic distance in our model. Finally, we extend the usability of the embeddings by predicting them for out-of-vocabulary entities as well, and by extracting relations between those entities using approaches previously applied to link prediction tasks in machine learning.
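
To make the idea of semantic distance between entity embeddings concrete, the following is a minimal sketch that assumes cosine distance as the metric. The project description does not state which metric is used, so this choice is an assumption, and the entity names and three-dimensional vectors are invented toy values rather than real DBpedia embeddings.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy embeddings (illustrative values only, not trained vectors).
embeddings = {
    "dbr:Berlin": [0.9, 0.1, 0.3],
    "dbr:Paris": [0.8, 0.2, 0.35],
    "dbr:Photosynthesis": [0.05, 0.9, 0.1],
}

# Two related entities (both cities) versus an unrelated pair.
related = cosine_distance(embeddings["dbr:Berlin"], embeddings["dbr:Paris"])
unrelated = cosine_distance(embeddings["dbr:Berlin"], embeddings["dbr:Photosynthesis"])

print(f"Berlin vs Paris:          {related:.3f}")
print(f"Berlin vs Photosynthesis: {unrelated:.3f}")
```

Under this assumption, a smaller value means the two entities are semantically closer, so related entities should score lower than unrelated ones.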



Bharat Suri


  • Tommaso Soru
  • Thiago Galery
  • Peng Xu