Overview

The event has passed

Date: 9 June 2023, 16:00-17:00
Location: MV:F23, Skeppsgränd 3
Language: English

Abstract: Self-supervised learning has emerged as a promising method for learning informative representations suitable for many machine learning tasks. However, while self-supervised representation learning has been instrumental in various fields, its significance in music information retrieval has only recently gained momentum. This thesis investigates the potential of the VICReg loss function for self-supervised learning in the audio time domain by comparing its performance against the established CLMR model. Following the evaluations performed in CLMR, we train our VICReg model on the publicly available Free Music Archive and GTZAN datasets. We then evaluate the learned representation on the downstream task of music classification on the MagnaTagATune dataset by training a linear logistic classifier and a two-layer MLP classifier on the representations generated by a frozen, pre-trained VICReg model. In our transfer learning experiments, VICReg achieves a ROC-AUC score of 89.15 and a PR-AUC score of 35.85, compared to 88.12 and 33.83, respectively, for CLMR, showing that VICReg performs competitively with CLMR. With more robust training and further tuning, we believe that VICReg can achieve superior performance compared to established loss functions for self-supervised representation learning in the audio domain, and we advocate continued exploration in this direction.
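
For readers unfamiliar with the VICReg loss named in the abstract, the following is a minimal NumPy sketch of its three terms (invariance, variance, covariance) as they are commonly formulated. It is an illustration only, not the thesis implementation: the function name vicreg_loss, the default weights (25, 25, 1), the target standard deviation gamma, and the choice to average the invariance term over all entries are assumptions taken from the commonly used formulation, and the inputs stand in for the embeddings of two augmented views of the same audio clips.

```python
import numpy as np


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """Sketch of a VICReg-style loss for two (batch, dim) embedding arrays.

    All default hyperparameters are assumptions, not values from the thesis.
    """
    z_a = np.asarray(z_a, dtype=np.float64)
    z_b = np.asarray(z_b, dtype=np.float64)
    n, d = z_a.shape

    # Invariance: mean squared error between the two views' embeddings.
    invariance = np.mean((z_a - z_b) ** 2)

    def variance_term(z):
        # Hinge on the per-dimension standard deviation, pushing it
        # towards at least gamma to discourage collapsed embeddings.
        std = np.sqrt(z.var(axis=0, ddof=1) + eps)
        return np.mean(np.maximum(0.0, gamma - std))

    def covariance_term(z):
        # Penalise squared off-diagonal entries of the covariance matrix,
        # decorrelating the embedding dimensions.
        centered = z - z.mean(axis=0)
        cov = centered.T @ centered / (n - 1)
        off_diagonal = cov - np.diag(np.diag(cov))
        return np.sum(off_diagonal ** 2) / d

    variance = variance_term(z_a) + variance_term(z_b)
    covariance = covariance_term(z_a) + covariance_term(z_b)
    return float(sim_w * invariance + var_w * variance + cov_w * covariance)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(32, 8))
    view_b = view_a + 0.1 * rng.normal(size=(32, 8))
    print("loss for two noisy views:", vicreg_loss(view_a, view_b))
    print("loss for collapsed embeddings:", vicreg_loss(np.zeros((32, 8)), np.zeros((32, 8))))
```

In this sketch, fully collapsed embeddings have zero invariance and covariance penalties, so their loss comes entirely from the variance term. That is the mechanism by which this family of losses avoids trivial solutions without requiring negative pairs.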

Master's Thesis presentation, Cody Hesse and Sebastian Löf
