Abstract—This research introduces an Emotion-based Music Recommendation System (EMRS) that uses Convolutional Neural Networks (CNNs) to analyze facial expressions and recommend music tailored to individual emotional states. Unlike traditional systems, EMRS prioritizes facial expression analysis for personalization. CNNs, trained on a diverse dataset of emotional expressions linked to music, extract key emotional features, and EMRS leverages this analysis to suggest music that can potentially aid emotional regulation. This work not only advances personalized music recommendation but also opens the door to emotion-aware technology with applications in mental healthcare, such as support for emotional imbalance and trauma. EMRS has the potential to serve as a complementary tool for therapists and for individuals seeking emotional well-being through music.
I. INTRODUCTION
Harnessing the power of facial expressions, this project introduces an emotion-based music recommendation system. By analyzing the user's expressions through a webcam, the system recommends music tailored to their mood. This personalized approach aims to improve well-being by leveraging music's ability to influence emotions. Facial expression recognition, a well-established form of emotion analysis, plays the central role in capturing user sentiment. This technology has the potential not only to enhance music selection but also to support emotional regulation, potentially aiding individuals experiencing depression or sadness.
II. LITERATURE SURVEY
EMUSE (IRJMETS, 2022): EMUSE is a music recommendation system that leverages facial emotion detection in combination with the Spotify API to offer personalized music suggestions. It boasts a user-friendly interface where users can see their detected emotions alongside recommended songs, enhancing the overall listening experience. However, one of its limitations lies in its sole reliance on facial expressions, potentially overlooking deeper emotional cues that could impact the accuracy of recommendations. Additionally, the continuous analysis of facial features raises concerns about user privacy and the practicality of such a system.
Facial Emotion-based Music Recommender (ResearchGate, 2020): This system utilizes Convolutional Neural Networks (CNNs) for facial emotion detection and subsequent music recommendations. Its use of advanced machine learning techniques enables more nuanced emotion analysis, leading to tailored music suggestions. However, the system faces challenges in accurately capturing subtle emotions, which may affect the reliability of its recommendations. Moreover, continuous facial analysis raises privacy concerns, and the paper lacks detailed information on the evaluation methods used to assess the system's performance.
Emotion Based Music Recommendation (NORMA@NCI Library, 2020): This study explores a music recommendation system that employs CNNs trained on audio features extracted from music for more accurate emotion representation. By analyzing audio signals directly, the system aims to capture the emotional content of music more effectively. However, it relies on Support Vector Machines (SVMs) for music recommendations, which may not be as efficient as other recommendation methods like collaborative filtering. Additionally, the lack of detailed information regarding the CNN architecture and the specific audio features used limits a comprehensive understanding of the system's capabilities.
Music Recommendations System (JES Publication, 2021): This system utilizes CNNs to analyze music based on Mel-Frequency Cepstral Coefficients (MFCCs) for emotion-based recommendations. By directly analyzing audio features, the system offers a unique approach to understanding the emotional context of music. However, the paper lacks specific details about the CNN architecture employed and does not discuss the recommendation strategy adopted by the system, limiting insights into its inner workings and potential improvements.
Emotion Based Music Recommendation System (IJRPR, 2020): This study explores an emotion-based music recommendation system that integrates transfer learning with pre-trained models such as ResNet50, SENet50, and VGG16 for facial emotion detection. By leveraging advanced techniques like ensemble learning, the system aims to improve the accuracy of emotion detection and subsequent music recommendations. However, like other facial detection systems, its reliance on facial expressions alone may lead to accuracy issues, highlighting the ongoing challenges in this field.
III. METHODOLOGY
B. Emotion Detection Module
Face Detection: Real-time face detection in computer vision categorizes image regions as "face" or "not face" using classifiers trained on large datasets. The popular OpenCV library offers both LBP and Haar cascade classifiers. Haar cascades, trained on diverse facial data, identify faces reliably despite variations in pose, lighting, and scale. The goal is to localize faces in images or video while minimizing background distractions. The method applies a cascade of trained stages in which Haar-like wavelet features summarize groups of image pixels, achieving high accuracy at real-time speeds.
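A minimal sketch of this detection loop is shown below, using the frontal-face Haar cascade bundled with OpenCV; the webcam index and detectMultiScale parameters are common defaults rather than values specified in this work.

```python
# Sketch: Haar cascade face detection with OpenCV (default webcam, common
# parameter defaults; not the exact configuration used in this paper).
import cv2

# Load the frontal-face Haar cascade shipped with the opencv-python package.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

cap = cv2.VideoCapture(0)  # index 0: default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces at multiple scales; 1.1 / 5 are widely used defaults.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```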
Feature Extraction: Pre-trained CNNs can be repurposed as feature extractors. We pass the image through the network up to a chosen layer, using its output (feature maps) as extracted features. Early layers capture low-level features with few filters, while deeper layers use more filters to capture complex features but are computationally expensive. This approach leverages the network's learned discriminative features. Feature maps, visualized for each layer, reveal which features were crucial for image classification.
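The sketch below illustrates this idea with a Keras/TensorFlow VGG16 backbone truncated at an intermediate layer; the chosen layer name, image file, and input size are illustrative assumptions, not details taken from this work.

```python
# Sketch: using a pre-trained CNN as a feature extractor by reading out the
# feature maps of an intermediate layer.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image

base = VGG16(weights="imagenet", include_top=False)
# Truncate the network at a mid-level block; earlier layers give low-level
# features, deeper layers give more complex (and more expensive) ones.
extractor = Model(inputs=base.input,
                  outputs=base.get_layer("block3_conv3").output)

img = image.load_img("face.jpg", target_size=(224, 224))  # hypothetical file
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
feature_maps = extractor.predict(x)
print(feature_maps.shape)  # (1, 56, 56, 256) for block3_conv3 at 224x224
```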
Emotion Detection: Our emotion detection system leverages a Visual Geometry Group (VGG)-16 Convolutional Neural Network (CNN) architecture. The CNN employs filters to detect features such as edges in the input image, generating feature maps passed through the ReLU activation function. Pooling (often max-pooling) is then applied to reduce sensitivity to minor image variations. The flattened feature maps are fed into a fully connected classifier, which can be binary (e.g., happy/not happy) or multi-class (identifying one of several emotions). While the learned features are not directly interpretable, the VGG-16 architecture has proven effective for image classification tasks such as emotion detection. In our system, a VGG-16 model loaded with pre-trained weights analyzes user images in real time to predict and display emotions.
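A simplified sketch of this prediction step is given below. The weight file name, input resolution, and label ordering are placeholders for illustration, since the exact training configuration is not specified here.

```python
# Sketch: real-time emotion prediction with a VGG-16-style classifier.
# Weight file, 224x224 input size, and label order are assumptions.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

EMOTIONS = ["Angry", "Happy", "Neutral", "Sad", "Surprise"]  # assumed order
model = load_model("vgg16_emotion.h5")  # hypothetical trained weights

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def predict_emotion(frame):
    """Crop the first detected face, resize it, and classify its emotion."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None  # no face in this frame
    x, y, w, h = faces[0]
    face = cv2.resize(frame[y:y + h, x:x + w], (224, 224)) / 255.0
    probs = model.predict(np.expand_dims(face, axis=0))[0]
    return EMOTIONS[int(np.argmax(probs))]
```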
C. Music Recommendation Module
Using the emotion module, the user's real-time emotion is detected and labeled as Happy, Sad, Angry, Surprise, or Neutral. The recommendation module first establishes secure, authorized access to the user's Spotify account. Based on the detected emotional state, it then curates a tailored playlist from Spotify's extensive catalog. These personalized recommendations are presented on-screen, where the user can either play a recommended song directly within the app or transition to the Spotify platform for further exploration.
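The sketch below outlines this recommendation step using the spotipy client for the Spotify Web API. The credentials, the emotion-to-query mapping, and the use of the public search endpoint (rather than full user-library authorization) are simplifying assumptions for illustration.

```python
# Sketch: mapping a detected emotion label to Spotify tracks via spotipy.
# Client credentials and the emotion-to-query table are placeholders.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

EMOTION_QUERIES = {  # assumed mapping from detected label to a search query
    "Happy": "upbeat happy hits",
    "Sad": "soothing sad songs",
    "Angry": "calming instrumental",
    "Surprise": "energetic pop",
    "Neutral": "chill acoustic",
}

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET"))

def recommend(emotion, limit=10):
    """Return (track name, Spotify URL) pairs for the detected emotion."""
    query = EMOTION_QUERIES.get(emotion, "popular hits")
    results = sp.search(q=query, type="track", limit=limit)
    return [(t["name"], t["external_urls"]["spotify"])
            for t in results["tracks"]["items"]]

for name, url in recommend("Happy"):
    print(name, url)
```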
IV. RESULTS & ANALYSIS
This emotion-based music recommendation system achieves 80% accuracy in emotion classification. A Haar cascade algorithm first detects the user's face; a pre-trained deep learning model then predicts the emotion from six categories: anger, fear, happiness, sadness, surprise, and neutrality.
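As an illustration, such an accuracy figure can be computed on a held-out, labeled test set; the label arrays below are hypothetical examples, not data from this study.

```python
# Sketch: checking overall and per-emotion accuracy on a labeled test set.
# y_true and y_pred are toy placeholder arrays, not results from this work.
from sklearn.metrics import accuracy_score, classification_report

y_true = ["Happy", "Sad", "Angry", "Happy", "Neutral"]   # ground-truth labels
y_pred = ["Happy", "Sad", "Happy", "Happy", "Neutral"]   # model predictions

print(accuracy_score(y_true, y_pred))  # overall accuracy, here 0.8
# Per-emotion precision/recall; zero_division=0 silences empty-class warnings.
print(classification_report(y_true, y_pred, zero_division=0))
```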
V. FUTURE SCOPE
This project lays the groundwork for an emotion-aware music recommendation system. Future advancements could involve refining deep learning models and exploring methods like voice analysis for more accurate emotion detection. User feedback and listening history could personalize the emotion-to-music mapping, while integrating biometric data alongside facial recognition could provide a more holistic emotional picture. Collaboration with mental health professionals could even explore this technology as a tool for managing emotional imbalances and trauma through personalized music therapy. All the while, developing privacy-preserving techniques like federated learning would ensure user privacy remains a priority. By addressing these areas, this project has the potential to revolutionize music recommendation and even contribute to advancements in mental healthcare.
VI. CONCLUSION
This project explores emotion-based music recommendation using facial recognition. Deep learning (CNNs) analyzes facial expressions to detect emotions and curate personalized Spotify playlists matching the user's mood. Despite a user-friendly interface and access to a vast music library, limitations remain: emotion detection accuracy (80%) and the emotion-to-music mapping both require refinement, and managing computational resources and user privacy concerns around webcam access is crucial. Even so, the project highlights the potential for personalized music experiences. The technology extends beyond entertainment and shows promise in mental healthcare: music therapy is an established tool for managing emotional imbalances and trauma, and by offering personalized music based on detected emotions, this system could serve as a non-invasive aid to emotional well-being.
REFERENCES
[1] Phaneendra, Madhusmitha Muduli, Siri Lakshmi Reddy, R. Veenasree, "EMUSE – An Emotion based Music Recommendation System", IRJMETS, Vol. 04, Issue 05, May 2022.
[2] Maduri Athavle, Deepali Mudale, Upasana Shrivastav, Megha Gupta, "Music Recommendation based on Face Emotion Recognition", ICAI, Vol. 02, No. 018, pp. 1-11, 2021.
[3] G. Tirumala, M. Niharika, S. Shailu, M. Manaswini, G. Venga Vinodini, "Music Recommendation System based on Emotions using CNN", Journal of Engineering Sciences, Vol. 14, 2023.
[4] Ashwini Jadhav, Nikita Bhaise, Mihir Narwade, Ruchita Phalke, Yash Talele, "Emotion based Music Recommendation System using CNN", International Journal of Research Publication and Reviews, Vol. 4, pp. 3288-3292, May 2023.
[5] K. Lekha Sree, P. Praveen Kumar, "Music Recommend System using Facial Emotion Recognition", IJCRT, Vol. 11, Issue 10, Oct 2023.
[6] Renuka Mokalkar, Uday Gaikwad, Amol Jagtap, "Emotion based Music Recommendation", International Journal of Creative Research Thoughts (IJCRT), Vol. 11, Issue 5, May 2023.
[7] Vijay Prakash Sharma, Azeem Saleem Gaded, Deevesh Chaudhary, "Emotion based Music Recommendation System", 2021 9th ICRITO, DOI: 10.1109/ICRITO51393.2021.9596276.
[8] D. Ayata, Y. Yaslan, M. E. Kamasak, "Emotion based Music Recommendation System using Wearable Physiological Sensors", IEEE Transactions on Consumer Electronics, Vol. 64, No. 2, pp. 196-203, 2018.
[9] J. Zhang, "Movies and Pop Songs Recommendation System by Emotion Detection through Facial Recognition", Journal of Physics: Conference Series, IOP Science, Vol. 1650, 2020.
[10] Zhiyuan Liu, Wei Xu, Wenping Zhang, "An Emotion-based Personalized Music Recommendation Framework for Emotion Improvement", Information Processing & Management, Vol. 60, Issue 3, May 2023.
[11] Kevin Patel, Rajeev Kumar Gupta, "Song Playlist Generator System Based on Facial Expression and Song Mood", International Conference on Artificial Intelligence and Machine Vision, IEEE, Jan 2022.
[12] M G Siddaraj, Avais Ismail, Mohammed Aleem, Sanchitha, "Mood based Music System using Machine Learning Techniques", International Journal of Advances in Engineering and Management (IJAEM), Vol. 4, Issue 7, July 2022.