In this project we suggest songs to a user based on their personal favorite music by analyzing its composition. We take advantage of a Convolutional Neural Network (CNN) to learn meaningful audio representations and suggest similar songs to the user, tailored to each individual's preferences. We studied several deep learning models that can improve recommendation accuracy over traditional methods such as collaborative filtering (CF), including Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Among these, the CNN showed the most promising results, so we chose it for our personalized music recommender.
I. INTRODUCTION
There are millions of songs available, but choosing the songs you like is hard: music taste is personal, and what you like may not appeal to others. A list of songs suggested specifically for you is therefore genuinely valuable, and with modern technology and the internet it is possible to build a personalized music recommendation system that suggests the songs you are most likely to love. Learning from a user's favourite songs can lead to many effective music suggestions. Our goals are to suggest songs to an individual user based on his preferences and to compare and analyse the suggested songs against traditional methods. We studied several deep learning models that can improve recommendation accuracy over traditional methods such as collaborative filtering.
II. RESEARCH METHODOLOGY
Our general approach is divided into two parts: first, extracting music features; second, identifying the music genre.
A. Extracting Music Features
We cut each song into 30-second chunks and create Mel-frequency spectrograms from them.
A spectrogram is obtained by mapping the audio signal from the time domain to the frequency domain using the fast Fourier transform, applied to overlapping windowed segments of the signal.
We then convert the y-axis (frequency) to a log scale and the colour dimension (amplitude) to decibels to form the spectrogram, and map the frequency axis onto the mel scale to form the mel spectrogram.
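The mel mapping used in this step follows the standard O'Shaughnessy formula, mel(f) = 2595 · log10(1 + f/700). A minimal illustrative sketch (librosa's internal constants may differ slightly depending on the `htk` option):

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (O'Shaughnessy formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel: float) -> float:
    """Inverse mapping: a mel value back to frequency in Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# The scale is roughly linear below 1 kHz and logarithmic above it;
# by construction hz_to_mel(1000) is approximately 1000 mel.
```

This perceptual warping is why mel spectrograms work well for music tasks: it allocates more resolution to the low frequencies where human pitch discrimination is finest.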
Below is the example of a Mel Spectrogram
To avoid all this hassle we used a powerful Python library called librosa, which has a built-in module to convert audio signals into mel-scale spectrograms. The spectrogram data is stored locally in a JSON file for training the model.
B. Personalized CNN Model
We create a personalized model for each user to recommend songs similar to the user's selected favourite songs.
We used a three-layer convolutional neural network.
We take 10-second audio clips as input data and obtain log-compressed mel-spectrograms with 128 frequency bands, then convert each clip into a 130 × 13 × 1 matrix representation (13 MFCC coefficients over 130 time frames). We save this MFCC data to a JSON file and feed it to our first layer, which has 32 filters with a 3×3 kernel and ReLU activation. To downsample the input we use max pooling with a 3×3 pool size, 2×2 strides, and zero padding, and we normalize the layer's activations with BatchNormalization.
The second layer has the same parameters as the first.
The third layer has 32 filters with a 2×2 kernel and ReLU activation, followed by the same max pooling (3×3 pool size, 2×2 strides, zero padding) and BatchNormalization.
We then flatten the 2D output into a 1D array and feed it to a dense layer of 64 neurons with ReLU activation.
To avoid overfitting we use a dropout rate of 30%.
Finally, a dense layer of 10 neurons, one per music genre, with softmax activation constrains the outputs to the range 0 to 1, so each neuron directly gives the probability of its genre.
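The architecture described above can be sketched in Keras. This is a reconstruction under the stated hyperparameters (the authors' exact code, padding choices, and optimizer are assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(130, 13, 1), n_genres=10):
    """Three conv blocks -> dense(64) -> dropout(0.3) -> softmax over genres."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Block 1: 32 filters, 3x3 kernel, ReLU, then 3x3 max pool (stride 2,
        # zero padding) and batch normalization.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same"),
        layers.BatchNormalization(),
        # Block 2: same parameters as block 1.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same"),
        layers.BatchNormalization(),
        # Block 3: 32 filters with a 2x2 kernel.
        layers.Conv2D(32, (2, 2), activation="relu"),
        layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same"),
        layers.BatchNormalization(),
        # Flatten to 1D, dense hidden layer, dropout, softmax over 10 genres.
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_genres, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
```

Because softmax outputs sum to 1, the 10-neuron output can be read directly as a probability distribution over genres, and the arg-max gives the predicted genre.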
III. RESULTS AND DISCUSSION
Training on 100 songs (10 songs for each of 10 genres), we achieved accuracy of up to 80% with a loss of 0.53; accuracy should improve further with more training data.
Mel-frequency spectrograms have been used in over 100 papers on music information retrieval systems and have shown promising results.
CNNs have shown promising results in learning music feature representations.
Using genre as a music classifier helps identify what type a song is, which makes it easier to recommend songs of the same type to a user.
Our CNN model achieved a promising 80% accuracy with a loss of 0.53.
We can further improve recommendations by comparing individual songs within the same genre and assigning each a similarity score; the highest-scoring results would be shown at the top to give the user better recommendations.
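One way to realize this within-genre comparison (an illustrative sketch, not part of the current system) is to take a fixed-length embedding per song, for example the 64-dimensional dense-layer activations, and rank candidates by cosine similarity to the user's favourite:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_similarity(query: np.ndarray, candidates: dict) -> list:
    """Sort candidate songs (name -> embedding) by similarity to the query, best first."""
    scores = {name: cosine_similarity(query, emb) for name, emb in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy 64-dimensional embeddings standing in for real model activations.
rng = np.random.default_rng(0)
fav = rng.normal(size=64)
candidates = {
    "song_a": fav + 0.1 * rng.normal(size=64),  # near-duplicate of the favourite
    "song_b": rng.normal(size=64),              # unrelated song
}
ranking = rank_by_similarity(fav, candidates)
```

Cosine similarity ignores vector magnitude, so it compares what the embeddings encode rather than how loud or long the clips were, which is a common choice for this kind of ranking.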