This project proposes the design and integration of a deep learning framework into Audacity, with a focus on developing an interface that lets end users interact with state-of-the-art (SOTA) audio source separation models. Although the vast majority of SOTA source separation algorithms are built using Python-based frameworks, PyTorch’s C++ TorchScript API provides an elegant way to deploy Python models in C++ applications. In Audacity, this would let users choose from a collection of domain-specific pretrained models, each suited to a different type of audio (e.g., speech or music). It would also provide the foundational code for future integration of deep models designed for other tasks, such as speech recognition and audio classification. The goal of this work is not only to integrate source separation algorithms into Audacity but, more importantly, to lay the groundwork for an extensible suite of open-source, deep-learning-based music information retrieval (MIR) tools hosted by the world’s most popular free and open-source audio editor.



Hugo Flores Garcia


  • Dmitry Vedenko