Contributor
Prathamesh Ghatole

Automate areas management in MusicBrainz


Mentors
reosarevok, Michael Wiencek
Organization
MetaBrainz Foundation Inc
Technologies
Python, Linux, PostgreSQL, Docker, pandas, shell scripting
Topics
automation, data analytics, web scraping, data engineering, ETL
MusicBrainz is a community-maintained database of music metadata. It covers artists, their releases, and related data such as recording and release dates, labels, and track listings, as well as the locations associated with them. The database tracks area types such as countries, cities, and districts to indicate the locations of recording studios, artist birthplaces, concert halls, and events. Given the scope of this data, MusicBrainz relies on external databases such as Wikidata and GeoNames to keep its area metadata up to date. However, adding areas to the database is currently a manual process, which is cumbersome, causes delays, and keeps outdated areas from being updated frequently. With this GSoC project, we aim to tackle this issue by building a new Mechanize-based “AreaBot” written in Python (similar to the old Perl bot) that automatically maintains and updates areas in MusicBrainz using Wikidata.
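To make the idea concrete, here is a minimal sketch of one step such a bot might perform: scanning Wikidata entities and flagging those that carry no MusicBrainz area ID (Wikidata property P982). It is an illustration under stated assumptions, not the project's actual implementation. The function names and the sample records are hypothetical, the input is assumed to follow the entity JSON shape returned by Wikidata's `wbgetentities` API, and fetching from Wikidata and submitting edits to MusicBrainz are left out entirely.

```python
# Wikidata property that stores a MusicBrainz area ID on an entity.
MB_AREA_PROPERTY = "P982"


def musicbrainz_area_ids(entity):
    """Return the MusicBrainz area IDs recorded on one Wikidata entity dict."""
    ids = []
    for claim in entity.get("claims", {}).get(MB_AREA_PROPERTY, []):
        snak = claim.get("mainsnak", {})
        # "novalue" and "somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"])
    return ids


def english_label(entity):
    """Return the entity's English label, or None if it has none."""
    label = entity.get("labels", {}).get("en")
    return label["value"] if label else None


def areas_missing_from_musicbrainz(entities):
    """Return (Wikidata ID, English label) pairs for entities without a P982 claim.

    `entities` is a mapping of Wikidata ID to entity dict (hypothetical helper,
    not part of any existing bot).
    """
    return [
        (qid, english_label(entity))
        for qid, entity in sorted(entities.items())
        if not musicbrainz_area_ids(entity)
    ]


# Hand-written placeholder records, not real Wikidata or MusicBrainz data.
SAMPLE = {
    "Q-example-a": {
        "labels": {"en": {"language": "en", "value": "Example City"}},
        "claims": {
            "P982": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P982",
                        "datavalue": {
                            "value": "00000000-0000-0000-0000-000000000000",
                            "type": "string",
                        },
                    }
                }
            ]
        },
    },
    "Q-example-b": {
        "labels": {"en": {"language": "en", "value": "Example Town"}},
        "claims": {},
    },
}

if __name__ == "__main__":
    for qid, label in areas_missing_from_musicbrainz(SAMPLE):
        print(f"{qid}: {label} has no MusicBrainz area ID")
```

Keeping the detection step as a pure function over already-fetched data means it can be checked offline, separately from any network access or edit submission.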