Contributor
Eiji Miyamoto

Tokenization for spaceless orthographies in Japanese


Mentors
Kevin Brubeck Unhammer
Organization
Apertium
Technologies
python, c++, xml
Topics
machine learning, nlp
This project investigates a suitable tokenizer for East and South Asian languages, which are usually written without spaces between words, and implements it. It also improves the Japanese-related files.