Contributor
Ioannis Daras

Addition of the Greek language to spacy.io


Mentors
Panos Louridas, Markos Gogoulos
Organization
GFOSS - Open Technologies Alliance

I propose adding the Greek language to spacy.io and implementing an additional sentiment analysis feature for Greek.

For the integration of Greek into the spacy.io platform, I suggest the following metrics for evaluating the project. First, the Greek language model should pass the language-independent "tokenizer sanity" tests provided by spacy.io. Second, the model should be measured against language-specific tests. Finally, the model could be evaluated on real-world data from the official Greek Government Gazette (FEK, ΦΕΚ) for named entity extraction and document categorization, as mentioned in the GFOSS ideas list for GSoC 2018.

For sentiment analysis, I would like to implement a binary classifier that, given a piece of Greek text, determines whether the writer's attitude towards the topic is positive or negative. The classifier will also return a polarity score, which serves as a measure of the classifier's confidence in its decision.
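As a rough illustration of the classifier interface described above (a label plus a polarity score), the following is a minimal sketch only. It swaps in a toy lexicon-based scorer passed through a logistic function, which is not the method proposed here. The names `Sentiment`, `LEXICON`, and `classify`, as well as the word weights, are hypothetical and chosen purely for illustration.

```python
import math
from typing import NamedTuple


class Sentiment(NamedTuple):
    label: str       # "positive" or "negative"
    polarity: float  # confidence in the label, between 0.5 and 1.0


# Hypothetical toy lexicon with illustrative weights (an assumption,
# not part of the proposal). Positive weights indicate positive words.
LEXICON = {
    "καλός": 1.5,      # good
    "υπέροχος": 2.0,   # wonderful
    "κακός": -1.5,     # bad
    "απαίσιος": -2.0,  # awful
}


def classify(text: str) -> Sentiment:
    """Return a binary sentiment label and a confidence-like polarity score."""
    # Naive whitespace tokenization; a real system would use a proper tokenizer.
    tokens = [tok.strip(".,;!?") for tok in text.lower().split()]
    # Sum the lexicon weights of known words; unknown words contribute 0.
    score = sum(LEXICON.get(tok, 0.0) for tok in tokens)
    # Squash the raw score into a probability of the positive class.
    p_positive = 1.0 / (1.0 + math.exp(-score))
    # Ties (score == 0) are reported as "positive" with polarity 0.5.
    if p_positive >= 0.5:
        return Sentiment("positive", p_positive)
    return Sentiment("negative", 1.0 - p_positive)
```

Calling `classify("Υπέροχος καφές!")` returns the label `"positive"` together with a polarity above 0.5. A trained model would replace the lexicon lookup, but the return shape (label plus confidence) could stay the same.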