Source Separation – GSoC 2021 Week 4

Hi all! Here are this week’s updates: 

Although the project focuses on building a source separation effect, much of the code written for it has proven generic enough to work with any deep learning-based audio processor that meets certain input-output constraints. We will therefore provide a way for researchers and deep learning practitioners to share their source separation (and more!) models with the Audacity community.

The “Deep Learning Effect” infrastructure can be used with any PyTorch-based model that takes a single-channel (optionally multichannel) waveform and outputs an arbitrary number of audio waveforms, which are then written to output tracks.
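To make that input-output constraint concrete, here is a minimal sketch of a model that would fit it. The class name `ToySeparator`, the `(channels, samples)` tensor layout, and the mono downmix are all assumptions made for illustration; this is not the actual wrapper that will ship with Audacity or nussl.

```python
import torch
from torch import nn


class ToySeparator(nn.Module):
    """Hypothetical stand-in for a real model: maps one waveform to n_sources waveforms."""

    def __init__(self, n_sources: int = 2):
        super().__init__()
        # A 1x1 convolution turns 1 input channel into n_sources output channels.
        self.mix = nn.Conv1d(1, n_sources, kernel_size=1, bias=False)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # Assumed layout: (channels, samples).
        if waveform.dim() != 2:
            raise ValueError("expected a (channels, samples) tensor")
        # Downmix multichannel input to mono so the model always sees one channel.
        mono = waveform.mean(dim=0, keepdim=True)   # (1, samples)
        out = self.mix(mono.unsqueeze(0))           # (1, n_sources, samples)
        return out.squeeze(0)                       # (n_sources, samples)


model = ToySeparator(n_sources=3)
stereo = torch.randn(2, 1000)        # a 2-channel, 1000-sample test signal
with torch.no_grad():
    sources = model(stereo)
print(sources.shape)                 # one row per output track
```

Each row of the returned tensor would correspond to one output track. A real model would replace the 1x1 convolution with an actual separation or enhancement network.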

This makes it possible to offer an entire suite of processors, such as speech denoisers, speech enhancers, source separators, and audio super-resolution models, with contributions from the community. People will be able to upload the models they want to contribute to HuggingFace, and we will provide an interface for users to browse and download these models from within Audacity. I will be working with nussl to provide wrappers and guidelines to ensure that uploaded models are compatible with Audacity.

I met with Ethan from the nussl team, as well as Jouni and Dmitry from the Audacity team. We talked about what the UX design for the Deep Learning effects in Audacity would look like. To make these different models available to users, we plan to design a package manager-style interface for installing and uninstalling deep learning models in Audacity.

I made a basic wireframe of what the model manager UI would look like:

Goals for this week:

  • Work on the backend for the deep model manager in Audacity. The manager should be able to:
    • Query HuggingFace for model repos that match certain tags (e.g. “Audacity”). 
    • Keep a collection of these repos, along with their metadata.
    • Search and filter through the repos with respect to different metadata fields.
    • Install and uninstall different models upon request.
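
To illustrate how these four responsibilities could fit together, here is a rough in-memory sketch. It is written in Python purely for readability and is not the Audacity implementation. Every name in it (`ModelRepo`, `ModelManager`, `fake_fetch`, the metadata keys, and the example repo IDs) is hypothetical, and the HuggingFace query is replaced by a local stub so that the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class ModelRepo:
    """Hypothetical record for one model repo and its metadata."""
    repo_id: str
    tags: Tuple[str, ...]
    metadata: dict = field(default_factory=dict)


class ModelManager:
    """Keeps a collection of repos, filters them, and tracks installs."""

    def __init__(self, fetch_repos: Callable[[str], Iterable[ModelRepo]]):
        # fetch_repos stands in for the real query against the model hub.
        self._fetch_repos = fetch_repos
        self._repos: Dict[str, ModelRepo] = {}
        self._installed: Set[str] = set()

    def refresh(self, tag: str) -> None:
        # Query for repos matching the tag and cache them by ID.
        self._repos = {r.repo_id: r for r in self._fetch_repos(tag)}

    def search(self, **criteria) -> List[ModelRepo]:
        # Keep repos whose metadata matches every requested field.
        return [
            r for r in self._repos.values()
            if all(r.metadata.get(k) == v for k, v in criteria.items())
        ]

    def install(self, repo_id: str) -> None:
        if repo_id not in self._repos:
            raise KeyError(f"unknown repo: {repo_id}")
        # A real backend would download the model files here.
        self._installed.add(repo_id)

    def uninstall(self, repo_id: str) -> None:
        # A real backend would delete the downloaded files here.
        self._installed.discard(repo_id)

    def installed(self) -> List[str]:
        return sorted(self._installed)


# Made-up data standing in for the remote hub.
_FAKE_HUB = [
    ModelRepo("example/separator", ("audacity",), {"effect_type": "source-separation"}),
    ModelRepo("example/denoiser", ("audacity",), {"effect_type": "denoising"}),
    ModelRepo("example/unrelated", ("other",), {"effect_type": "denoising"}),
]


def fake_fetch(tag: str) -> List[ModelRepo]:
    return [r for r in _FAKE_HUB if tag in r.tags]


manager = ModelManager(fake_fetch)
manager.refresh("audacity")
for repo in manager.search(effect_type="denoising"):
    print(repo.repo_id)
manager.install("example/denoiser")
print(manager.installed())
```

Passing the fetch function in as a parameter keeps the querying step separate from the bookkeeping, so the stub can later be swapped for a real HuggingFace query without touching the search or install logic.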