Source Separation – GSoC 2021 Week 3

Hi all! Here are this week’s updates: 

I spent the past week refactoring so that there’s a generic EffectDeepLearning class that we can inherit from to use deep learning models for applications beyond source separation (like audio generation or labeling). I also changed the resampling behavior. Instead of resampling the whole track (which is wasteful if we’re only processing a 10s selection in a 2hr track), resampling is now done directly on each buffer block via a TorchScript module borrowed from torchaudio. Additionally, I added a CMake script for including built-in separation models in the Audacity package. 
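To give a feel for why per-block resampling helps, here is a rough sketch of the bookkeeping in plain Python. It is not the actual Audacity code: the function names, the 65536-sample block size, and the 44.1kHz track rate are all assumptions made up for illustration, and no real resampling is performed (the effect itself uses a TorchScript module for that).

```python
def selection_blocks(sel_start, sel_len, block_size):
    """Yield (start, length) pairs that cover only the selected samples."""
    pos = sel_start
    end = sel_start + sel_len
    while pos < end:
        n = min(block_size, end - pos)
        yield pos, n
        pos += n


def resampled_length(n, src_rate, dst_rate):
    """Number of output samples for n input samples, rounded up (integer math)."""
    return -(-n * dst_rate // src_rate)


# Assumed numbers: a 2hr track at 44.1kHz, a 10s selection, an 8kHz model.
track_rate = 44100
model_rate = 8000
track_samples = 2 * 60 * 60 * track_rate
selection_samples = 10 * track_rate

blocks = list(selection_blocks(0, selection_samples, 65536))
touched_samples = sum(n for _, n in blocks)

# Each block would be resampled on its own; lengths are rounded per block,
# so their sum can differ by a few samples from resampling in one pass.
block_output_lengths = [resampled_length(n, track_rate, model_rate) for _, n in blocks]

print(f"blocks processed: {len(blocks)}")
print(f"samples touched: {touched_samples} of {track_samples}")
```

With these assumed numbers, only the selection is ever read and resampled, which is 1/720 of the samples in the track.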

Goals for next week

UI work

Because the separation models will be used as essentially black boxes (audio mixture in, separated sources out), I don’t think there’s much I can do for the actual effect UI, except for allowing the user to import pretrained models and displaying relevant metadata (sample rate, speech/music, and possibly an indicator of processing speed / separation quality). 
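As a sketch of what the metadata display could build on, the snippet below parses a small JSON description of a model and formats a one-line summary. The JSON layout, the field names, and the model name are hypothetical; nothing here is an agreed-upon format for Audacity models.

```python
import json

# Hypothetical metadata for illustration only; field names are assumptions.
EXAMPLE_METADATA = """
{
  "name": "example-speech-separator",
  "domain": "speech",
  "sample_rate": 8000,
  "size_mb": 10
}
"""

REQUIRED_FIELDS = ("name", "domain", "sample_rate", "size_mb")


def load_metadata(text):
    """Parse a metadata JSON string and check that the required fields exist."""
    data = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return data


def summary_line(meta):
    """Format the metadata as a short line suitable for a model picker."""
    rate_khz = meta["sample_rate"] / 1000
    return f"{meta['name']} ({meta['domain']}, {rate_khz:g} kHz, {meta['size_mb']} MB)"


meta = load_metadata(EXAMPLE_METADATA)
print(summary_line(meta))
```

Rejecting a model whose metadata is incomplete would let the effect warn the user before loading a file it cannot describe.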

The biggest user interaction happens when the user chooses a deep learning model. The models could potentially be hosted on Hugging Face (https://huggingface.co/). The models can range anywhere from 10MB to upwards of 200MB in size, process audio at different sample rates, and take different amounts of time to compute. We could have a dedicated page on the Audacity website or in the manual that explains how to choose and download separation models and links to specific models that are Audacity-ready. 

Ideally, we would like to offer models for different use cases. For example, a user wanting to quickly denoise a recorded lecture would be fine using a lower-quality speech separation model with an 8kHz sample rate. On the other hand, someone trying to upmix a song would probably be willing to accept a longer compute time in exchange for a higher-quality music separation model with a 48kHz sample rate. 
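The trade-off described above could be sketched as a small lookup over a model catalog. Everything in this example is hypothetical: the model names, sizes, and speed scores are invented, and using sample rate as a stand-in for quality is a simplifying assumption rather than a real quality measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    domain: str            # "speech" or "music"
    sample_rate: int       # in Hz
    size_mb: int
    relative_speed: float  # made-up score, higher means faster


# Invented catalog entries for illustration only.
CATALOG = [
    ModelInfo("speech-fast", "speech", 8000, 10, 4.0),
    ModelInfo("speech-hq", "speech", 16000, 50, 1.5),
    ModelInfo("music-fast", "music", 22050, 60, 2.0),
    ModelInfo("music-hq", "music", 48000, 200, 0.5),
]


def pick_model(catalog, domain, prefer):
    """Pick a model for a domain, preferring either "speed" or "quality"."""
    candidates = [m for m in catalog if m.domain == domain]
    if not candidates:
        raise ValueError(f"no models for domain: {domain}")
    if prefer == "speed":
        return max(candidates, key=lambda m: m.relative_speed)
    if prefer == "quality":
        # Assumption: a higher sample rate stands in for higher quality.
        return max(candidates, key=lambda m: m.sample_rate)
    raise ValueError(f"unknown preference: {prefer}")


lecture_model = pick_model(CATALOG, "speech", "speed")
upmix_model = pick_model(CATALOG, "music", "quality")
print(lecture_model.name, upmix_model.name)
```

In this toy catalog, the lecture case lands on the 8kHz speech model and the upmixing case on the 48kHz music model, which mirrors the two examples in the text.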

Build work

I’ve managed to get the Audacity + deep learning build working on Linux and macOS, but not yet on Windows. I’ll spend some time this week looking into writing a Conan recipe for libtorch that simplifies the build process for all platforms.