DBpedia Spotlight Dashboard: an integrated statistical information tool from the Wikipedia dumps and the DBpedia Extraction Framework artifacts
- Mentors
- Said Polanco-Martagón, Maribel, Beyza Yaman, Julio Hernandez, Jan Forberg
- Organization
- DBpedia
DBpedia Spotlight was released in 2011 by DBpedia. It is a tool for annotating mentions of DBpedia resources in text, providing a way to link unstructured information sources to the Linked Open Data cloud through DBpedia.
To make this possible, a model must be created for each language from two kinds of input: the Wikistats files (uriCounts, pairCounts, sfAndTotalCounts and tokenCounts), which are obtained from the Wikipedia dump, and the following DBpedia Extraction Framework artifacts: instance types, redirects and disambiguations.
The main idea of the project is to build a Dashboard that shows statistical information about the data collected by the DBpedia Extraction Framework and Wikistats. This information gives an overview of the existing entity types (classes), how they are statistically represented (for example, which type of entity is the most common), and the trends they show. In addition, the Dashboard is intended to include a comparison between versions of the same language, making it easier to see changes from one version to another (number of entities, types of entities, trends of each version, etc.). All this information can be used to improve the identification of topics in documents.
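As a rough illustration of the kind of statistic the Dashboard could show, the following Python sketch counts how many entities belong to each type in an instance-types artifact and then compares two versions. It assumes the artifact is available as N-Triples-style lines with `rdf:type` statements; the function names (`count_types`, `compare_versions`) and the sample triples are hypothetical and are not part of DBpedia Spotlight or the Extraction Framework.

```python
import re
from collections import Counter

# Full IRI of rdf:type, the predicate expected in instance-types statements.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches a simple "<subject> <predicate> <object> ." line (IRIs only).
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$")


def count_types(lines):
    """Count entities per type; lines that are not rdf:type triples are skipped."""
    counts = Counter()
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == RDF_TYPE:
            counts[match.group(3)] += 1
    return counts


def compare_versions(old_counts, new_counts):
    """Return {type: new - old} for every type whose count changed."""
    all_types = sorted(set(old_counts) | set(new_counts))
    return {
        t: new_counts.get(t, 0) - old_counts.get(t, 0)
        for t in all_types
        if new_counts.get(t, 0) != old_counts.get(t, 0)
    }


def triple(resource, onto_class):
    """Build one illustrative rdf:type line (hypothetical sample data)."""
    return (
        f"<http://dbpedia.org/resource/{resource}> "
        f"<{RDF_TYPE}> "
        f"<http://dbpedia.org/ontology/{onto_class}> ."
    )


# Hypothetical samples standing in for two versions of the same language.
old_sample = [triple("Berlin", "City"), triple("Ada_Lovelace", "Person")]
new_sample = old_sample + [triple("Alan_Turing", "Person"), triple("Danube", "River")]

old_counts = count_types(old_sample)
new_counts = count_types(new_sample)

most_common_type = new_counts.most_common(1)[0]  # (type IRI, count)
changes = compare_versions(old_counts, new_counts)
```

Here `most_common_type` answers "which type of entity is the most common" for one version, and `changes` lists the per-type differences between the two versions. With real artifacts, the sample lists would be replaced by lines read from the corresponding files.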