Contributor
Miku

Improving Japanese Search Capability: Adding Japanese Morphological Analysis Functionality


Mentors
Sarah Hoffmann, mtmail
Organization
OpenStreetMap
Technologies
Python
Topics
morphological analysis
Japanese addresses follow a block-based system that differs from the street-based formats used in most other regions. They are also written in Japanese, and English is rarely used for them. As a result, Japanese users face many challenges when searching for locations in Japan using Nominatim. This proposal addresses the problem of Japanese Kanji being recognized as Chinese characters, which causes incorrect locations to be returned. Specifically, I will apply morphological analysis to Japanese search queries so that they are segmented into words at the appropriate positions. The objectives of this proposal are as follows: (1) ensure that Nominatim does not mistake Japanese searches for Chinese; (2) ensure that Japanese location searches return locations within Japan; (3) ensure that Nominatim correctly identifies the block and house numbers in Japanese addresses and does not confuse the two.
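To illustrate what segmenting a Japanese address at appropriate positions can look like, here is a minimal sketch in Python. It is a rule-based regular expression used as a stand-in, not a real morphological analyzer and not Nominatim's implementation. The function name `split_japanese_address`, the field names, and the sample address are all hypothetical, and the pattern assumes a simple address of the form prefecture, municipality, area, then optional 丁目 (chome), 番 or 番地 (block), and 号 (house number).

```python
import re
import unicodedata

# Assumption: a simplified address layout. Real queries are far more varied,
# which is why the proposal relies on morphological analysis instead.
ADDRESS_PATTERN = re.compile(
    r"^(?P<prefecture>東京都|北海道|(?:京都|大阪)府|.{2,3}?県)"
    r"(?P<municipality>.+?[市区町村])"
    r"(?P<area>[^0-9]*)"
    r"(?:(?P<chome>[0-9]+)丁目)?"
    r"(?:(?P<block>[0-9]+)番地?)?"
    r"(?:(?P<house>[0-9]+)号)?$"
)


def split_japanese_address(query):
    """Split a simple Japanese address into named parts (hypothetical helper).

    Returns a dict of parts, or None if the query does not fit the pattern.
    """
    # NFKC normalization turns full-width digits such as "１" into ASCII "1".
    normalized = unicodedata.normalize("NFKC", query.strip())
    match = ADDRESS_PATTERN.match(normalized)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    parts = split_japanese_address("東京都千代田区丸の内１丁目２番３号")
    for name, value in parts.items():
        print(f"{name}: {value}")
```

Keeping the block (番) and the house number (号) in separate named groups mirrors the third objective: the two numbers are never interchangeable, so a query parser should label them explicitly rather than treat them as generic digits.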