It has been an exciting week for me, with the completion of the first brush tool prototype! Currently, each windowed sample is passed via SpectrumTransformer into the newly added SpectralDataManager, which checks it against the previously selected data and zeros out all the selected frequency bins. (Link to the commit)
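For the curious, here is a minimal, self-contained sketch of that per-window step. It is not the actual Audacity code; the names SelectedBins and ProcessWindow are hypothetical stand-ins for what SpectrumTransformer and SpectralDataManager do together.

```cpp
#include <complex>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the data tracked per window:
// the set of frequency-bin indices the user has brushed over.
using SelectedBins = std::unordered_set<size_t>;

// Zero out every selected bin in one windowed spectrum before the
// inverse transform and overlap-add reassemble the audio.
void ProcessWindow(std::vector<std::complex<float>>& spectrum,
                   const SelectedBins& selectedBins)
{
   for (size_t bin : selectedBins) {
      if (bin < spectrum.size())
         spectrum[bin] = { 0.0f, 0.0f };   // remove the frequency content
   }
}
```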
The brush tool demo
I have chosen “Killing Me Softly” by Roberta Flack (one of my all-time favorites!), and a four-second snippet has been extracted from the beginning. I have also added a meow sound to it, since we all love cats and, more importantly, it contains pitch variation that cannot be effectively selected by the current tool (a horizontal line).
The pitch variation cannot be completely selected by the current tool (yellow line)
To use the brush tool, we simply drag through the meow sound and its overtones, then click the Apply button; the selected frequency content will then be zeroed out.
Selected frequency bins and their overtones (shown in the yellow frame)
The selected area has been modified (zeroed out)
The full demo video is available here (with before vs. after audio comparison):
https://drive.google.com/file/d/1bQJGncHWj_GqD19LOPeEp_og3j70akw8/view?usp=sharing
What’s next?
This is still a rather early stage for this new feature, and there are lots of potential improvements. For instance, we can definitely do better than zeroing out all the selected frequency bins, such as averaging samples from the non-selected windows (horizontally) or frequencies (vertically), or both!
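As a rough sketch of the “vertical” variant, a selected bin could be filled from its nearest non-selected neighbours in the same window instead of being zeroed. The names here are illustrative only; the horizontal case would average the same bin across neighbouring windows instead.

```cpp
#include <complex>
#include <unordered_set>
#include <vector>

using SelectedBins = std::unordered_set<size_t>;

// Replace each selected bin with the average of the nearest
// non-selected bins below and above it in the same window.
void InterpolateWindow(std::vector<std::complex<float>>& spectrum,
                       const SelectedBins& selectedBins)
{
   const size_t n = spectrum.size();
   for (size_t bin : selectedBins) {
      if (bin >= n)
         continue;
      // Walk outwards until we find non-selected bins on each side.
      size_t below = bin, above = bin;
      while (below > 0 && selectedBins.count(below))
         --below;
      while (above + 1 < n && selectedBins.count(above))
         ++above;
      // Average the two neighbours; fall back to zero at the edges
      // where no non-selected bin exists.
      std::complex<float> lo =
         selectedBins.count(below) ? std::complex<float>{} : spectrum[below];
      std::complex<float> hi =
         selectedBins.count(above) ? std::complex<float>{} : spectrum[above];
      spectrum[bin] = 0.5f * (lo + hi);
   }
}
```

Note that averaging complex values directly can cause phase cancellation, so a real implementation would likely interpolate magnitudes and handle phase separately.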
Moreover, I would also like to make the selection smarter. In photo editing, say we were to remove or isolate a subject from the background image: we would rely first on tools like the magic wand to intelligently pick up most of the desired area for us, and then fine-tune with a drawing tool. That being said, I hope that the tool will be able to guess the user’s selection (or at least most of it), after which the user can add/remove spectral data around the edges using the brush tool.
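One way to picture such a “magic wand” is a simple flood fill over the spectrogram: start from the clicked time-frequency cell and grow into neighbours whose magnitude stays within some tolerance of the seed. This is purely an illustrative sketch; the data layout and names are assumptions, not the planned implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <queue>
#include <set>
#include <utility>
#include <vector>

using Cell = std::pair<size_t, size_t>;   // (window index, bin index)

// Grow a selection from `seed` into cells whose level (in dB) stays
// within `toleranceDb` of the seed. `magnitudes[t][f]` is a hypothetical
// magnitude spectrogram; the seed is assumed to be in range.
std::set<Cell> MagicWandSelect(const std::vector<std::vector<float>>& magnitudes,
                               Cell seed, float toleranceDb)
{
   std::set<Cell> selected;
   std::queue<Cell> frontier;
   frontier.push(seed);
   const float seedDb =
      20.0f * std::log10(magnitudes[seed.first][seed.second] + 1e-9f);

   while (!frontier.empty()) {
      auto [t, f] = frontier.front();
      frontier.pop();
      if (t >= magnitudes.size() || f >= magnitudes[t].size())
         continue;
      if (selected.count({ t, f }))
         continue;
      float db = 20.0f * std::log10(magnitudes[t][f] + 1e-9f);
      if (std::abs(db - seedDb) > toleranceDb)
         continue;   // too different from the seed, stop growing here
      selected.insert({ t, f });
      // Visit the four time/frequency neighbours.
      if (t > 0) frontier.push({ t - 1, f });
      frontier.push({ t + 1, f });
      if (f > 0) frontier.push({ t, f - 1 });
      frontier.push({ t, f + 1 });
   }
   return selected;
}
```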
A step even further would be picking up the overtones automatically for the user during the “magic wand” stage. However, the overtones can be a bit tricky to calculate, since their shapes are somewhat skewed in the linear view and we need to take the logarithmic scale as a reference when performing the computation (the user can edit in the logarithmic view, but we cannot easily “select the view” for the computation). Without introducing an advanced area-approximation algorithm, a possible approach would be to slide the fundamental frequency area across the frequency bins close to its integer multiples; we can then estimate and spot the overtones by calculating their spectral energy similarity.
Linear: the shape appears “distorted” at higher frequencies. Logarithmic: the overtones and f0 are more similarly shaped
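Below is an illustrative sketch of that sliding/similarity idea, comparing the energy profile around f0 with the profiles around its integer multiples via cosine similarity. It deliberately ignores the skew issue above (the template would really need to widen with frequency), and all names, parameters, and the similarity measure are my own assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity between two energy profiles of equal length.
static float CosineSimilarity(const std::vector<float>& a,
                              const std::vector<float>& b)
{
   float dot = 0, na = 0, nb = 0;
   for (size_t i = 0; i < a.size(); ++i) {
      dot += a[i] * b[i];
      na  += a[i] * a[i];
      nb  += b[i] * b[i];
   }
   return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-9f);
}

// Return the harmonic numbers whose neighbourhood looks like the
// fundamental's. `spectrum` holds one window's bin magnitudes, `f0Bin`
// the fundamental's centre bin, `halfWidth` the selection's half-width.
std::vector<int> GuessOvertones(const std::vector<float>& spectrum,
                                size_t f0Bin, size_t halfWidth,
                                int maxHarmonic, float threshold)
{
   // Energy profile of the bins around `centre`; out-of-range bins are 0.
   auto profile = [&](size_t centre) {
      std::vector<float> p;
      for (size_t i = 0; i <= 2 * halfWidth; ++i) {
         long long b = static_cast<long long>(centre)
                     - static_cast<long long>(halfWidth)
                     + static_cast<long long>(i);
         p.push_back((b >= 0 && b < static_cast<long long>(spectrum.size()))
                        ? spectrum[b] : 0.0f);
      }
      return p;
   };

   const auto fundamental = profile(f0Bin);
   std::vector<int> harmonics;
   for (int k = 2; k <= maxHarmonic; ++k) {
      size_t centre = f0Bin * k;          // slide to the k-th multiple
      if (centre + halfWidth >= spectrum.size())
         break;
      if (CosineSimilarity(fundamental, profile(centre)) >= threshold)
         harmonics.push_back(k);
   }
   return harmonics;
}
```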