a Large, Multilingual, Semantic Knowledge Graph

Technologies
java, scala, rdf, graph, nosql
Topics
big data, data science, natural language processing, semantic web, knowledge extraction

Almost every major Web company has now announced work on a Knowledge Graph: Google’s Knowledge Graph, Yahoo!’s Web of Objects, Microsoft’s Satori Graph, Walmart Labs’ Social Genome, and Facebook’s Entity Graph, to name just the biggest ones.

DBpedia

DBpedia is a community-run project that has been working on a free, open-source Knowledge Graph since 2006!

DBpedia currently describes 38.3 million “things” of 685 different “types” in 128 languages, with over 4 billion “facts”. It is interlinked to many other datasets. The knowledge in DBpedia is exposed through a technology stack called Linked Data, which has been revolutionizing the way applications interact with the Web: with Linked Data technologies, all APIs are interconnected via standard Web protocols and languages.
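
As a rough illustration of what "standard Web protocols and languages" can mean in practice, the sketch below builds an HTTP GET URL that carries a SPARQL query, following the `query` parameter convention of the SPARQL protocol. The endpoint address `https://dbpedia.org/sparql`, the helper names, and the choice of the `dbo:abstract` property are assumptions made for this example; the code only constructs the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed address of the public DBpedia SPARQL endpoint.
SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def build_abstract_query(resource: str, lang: str = "en") -> str:
    """Return a SPARQL query asking for the abstract of one DBpedia resource.

    `resource` is the local name of a DBpedia resource (hypothetical input,
    e.g. "Berlin"); it is inserted into the query without any escaping.
    """
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as the `query` parameter of a GET request URL."""
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Print the URL only; sending the request is left to the reader.
    print(build_request_url(build_abstract_query("Berlin")))
```

A client would typically send this URL with an `Accept` header naming the result format it wants, so that any HTTP library can act as a Linked Data consumer.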

The Web of data

Such a Web of data provides useful knowledge that can complement the Web of documents in many ways. Consider, for instance, how bloggers tag their posts or assign them to categories in order to organize and interconnect them. This is a very simple way to connect unstructured text to a structure (a hierarchy of tags). For a more advanced example, see how the BBC created its World Cup 2010 website by interconnecting textual content and facts from its knowledge base.

Or, more recently, did you see that IBM's Watson used DBpedia data to win the Jeopardy challenge?

DBpedia Spotlight

DBpedia Spotlight is an open source text annotation tool that connects text to Linked Data by marking names of things in text (we call that spotting) and selecting between multiple interpretations of these names (we call that disambiguation). For example, Washington can be interpreted in more than 50 ways including a state, a government or a person. You can already imagine that this is not a trivial task, especially when we're talking about millions of things and hundreds of types.
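
The two steps described above, spotting and disambiguation, can be sketched in a few lines. This is a toy illustration and not DBpedia Spotlight's actual algorithm: the candidate table, the context keywords, and the overlap-based scoring are all hypothetical and hand-written for the "Washington" example.

```python
import re

# Hypothetical candidate table: surface form -> {DBpedia URI: context keywords}.
# The keywords are invented for this sketch, not taken from DBpedia Spotlight.
CANDIDATES = {
    "washington": {
        "http://dbpedia.org/resource/Washington_(state)":
            {"state", "seattle", "pacific", "olympia"},
        "http://dbpedia.org/resource/Washington,_D.C.":
            {"capital", "congress", "government", "federal"},
        "http://dbpedia.org/resource/George_Washington":
            {"president", "general", "born", "army"},
    }
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def spot(text):
    """Spotting: return (surface form, character offset) for each known name."""
    return [
        (match.group(0), match.start())
        for match in re.finditer(r"[A-Za-z]+", text)
        if match.group(0).lower() in CANDIDATES
    ]


def disambiguate(surface, text):
    """Disambiguation: pick the candidate whose keywords overlap most with the text.

    Ties are resolved in favor of the candidate listed first in the table.
    """
    context = set(tokenize(text))
    candidates = CANDIDATES[surface.lower()]
    scores = {uri: len(keywords & context) for uri, keywords in candidates.items()}
    return max(scores, key=scores.get)


def annotate(text):
    """Run spotting, then disambiguation, and return (surface, offset, URI) triples."""
    return [
        (surface, offset, disambiguate(surface, text))
        for surface, offset in spot(text)
    ]


if __name__ == "__main__":
    print(annotate("Washington was the first president and a general of the army."))
    print(annotate("Seattle is the largest city in Washington state."))
```

With only three candidates the choice is easy; the difficulty mentioned above comes from doing this over millions of things, where the candidate table and the context model have to be learned from data rather than written by hand.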

We regularly grow our community through GSoC and can offer you more and more opportunities.

2017 Program

Successful Projects

Contributor
Ram G Athreya
Mentor
Ricardo Usbeck
Organization
DBpedia
First Chatbot for DBpedia - Ram G Athreya
The project requirement is to build a conversational chatbot for DBpedia that would be deployed on at least two social networks. There are...
Contributor
Luca Virgili
Mentor
Emanuele Storti, Domenico Potena
Organization
DBpedia
The table extractor
This is a proposal for the table extractor project. In this paper I explain how to improve last year's project and to create a general way in...
Contributor
Krishh
Mentor
Emanuele Storti, Marco Fossati, Domenico Potena
Organization
DBpedia
Wikipedia List-Extractor
Wikipedia, being the world’s largest encyclopedia, has a humongous amount of information present in the form of text. While key facts and figures are...
Contributor
Nausheen Fatma
Mentor
Sandro Athaide Coelho, Tommaso Soru
Organization
DBpedia
Knowledge Base Embeddings for DBpedia
Knowledge base embeddings have been an active area of research. In recent years a lot of research work such as TransE, TransR, RESCAL, SSP, etc. has...
Contributor
Shashank Motepalli
Mentor
Marco Fossati, Dimitris Kontokostas
Organization
DBpedia
Unsupervised Learning of DBpedia Taxonomy
DBpedia tries to extract structured information from Wikipedia and make information available on the Web. In this way, the DBpedia project develops a...
Contributor
Ismael Rodriguez
Mentor
Anastasia Dimou, Wouter Maroy, Dimitris Kontokostas
Organization
DBpedia
DBpedia Mappings Front-End Administration
Although the DBpedia Extraction Framework was adapted to support RML mappings thanks to a project from last year's GSoC, the user interface to create...
Contributor
Akshay Jagatap
Mentor
Sandro Athaide Coelho, Xu Peng, Tommaso Soru
Organization
DBpedia
Knowledge Base Embeddings for DBpedia - Akshay Jagatap
The project aims at defining embeddings to represent classes, instances and properties. Such a model tries to quantify semantic similarity as a...