Social media platforms such as Facebook, Twitter, Instagram, and Reddit have had a profound and lasting impact on society. They have increased connectivity among people and allowed individuals to create a digital persona. Alongside these benefits there are clear drawbacks: recent research has shown a link between excessive social media use and higher levels of depression. The objective of this project is to identify depression by analysing uploaded audio, uploaded text, or live voice recordings. The study uses machine learning techniques to detect potential signs of depression in users based on their network behaviour and tweets. To accomplish this, classifiers were trained and tested using features extracted from users' activities on social media networks and their tweets. The keywords associated with this study include social media, mental health, depression, network behaviour, and tweets.
I. INTRODUCTION
Depression is a prevalent mental illness that contributes significantly to global disability and can lead to suicide. It is estimated that more than 300 million people worldwide experience depression annually. Depression is typically diagnosed in person against clinical criteria. However, approximately 70% of individuals in the early stages of depression do not seek professional help, which allows their condition to progress. In recent years there has been a growing movement to use social media data to detect, estimate, and monitor the occurrence of diseases, including depression. The widespread use of social media gives mental health professionals and researchers an opportunity to gather valuable data, improving their understanding and equipping them with better tools to address mental health challenges.
II. RELATED WORK
The paper "Depression Detection using Emotional Artificial Intelligence" by Mandar Deshpande and Vignesh Rao, published in the proceedings of the IEEE International Conference on Intelligent Sustainable Systems, uses emotion analysis to identify depression. Another paper, "A review and meta-analysis of machine intelligence approaches for mental health issues and depression detection" by Ravita Chahar and Ashutosh Kumar Dubey, published in the International Journal of Advanced Technology and Engineering Exploration, discusses the use of machine learning algorithms to analyse social media data and build a system capable of detecting depression.
B. Proposed System
This project introduces an approach that uses machine learning to detect depression. It is delivered as a web-based application in which users register and log in to check their depression status. The application offers three input options: uploading an audio file, uploading a text file, or recording directly. With the recording option, users record their voice and the system detects depression from the recording. For text input, the data is first preprocessed and cleaned by removing unnecessary words, and the frequency of depression-related terms such as "depressed" or "suicidal" is then analysed. Based on these words, the system assigns a label indicating whether the speech is depressed or non-depressed. To do this, several algorithms, including KNN, AdaBoost, and Naïve Bayes, are used to identify the relevant words and detect depression.
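The text-cleaning and keyword-frequency step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the `threshold` parameter, and the function names `clean`, `depressed_term_counts`, and `label` are all hypothetical, and the only depression-related terms used are the two named in the text.

```python
import re
from collections import Counter

# Assumption: only the two terms named in the paper are used here.
DEPRESSED_TERMS = {"depressed", "suicidal"}

# Assumption: a tiny illustrative stop-word list, not the one used by the authors.
STOP_WORDS = {"a", "an", "and", "the", "i", "is", "am", "to", "of", "so", "it", "at"}


def clean(text):
    """Lowercase the text, strip URLs and @mentions, tokenise, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def depressed_term_counts(tokens):
    """Count how often each depression-related term occurs in the tokens."""
    return Counter(t for t in tokens if t in DEPRESSED_TERMS)


def label(text, threshold=1):
    """Label text as depressed when the term count reaches the (hypothetical) threshold."""
    counts = depressed_term_counts(clean(text))
    return "depressed" if sum(counts.values()) >= threshold else "non-depressed"


print(label("I feel depressed and suicidal"))  # prints: depressed
print(label("Had a great day at the park"))    # prints: non-depressed
```

A keyword count of this kind only produces a rough label; in the system described here, such labels and word features would then be passed to the trained classifiers.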
Advantages of this approach include:
It is beneficial for individuals who may hesitate to openly discuss their feelings.
The application can be accessed by anyone with an internet connection, at little or no cost.
C. Proposed Algorithm
Logistic Regression: Logistic regression is a classification algorithm in machine learning that predicts the probability of specific classes based on certain independent variables.
Random Forest: Random Forest is a popular supervised machine learning algorithm used for classification and regression problems. It trains multiple decision trees and combines their outputs, by majority vote for classification or by averaging for regression, to make predictions.
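AdaBoost Classifier: AdaBoost is an ensemble technique that combines many weak learners into a stronger classifier. After each round it increases the weight of the training samples that were misclassified, so that later learners focus on the harder examples.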
Stochastic Gradient Descent (SGD): SGD is an efficient optimization algorithm used for fitting linear classifiers and regressors under convex loss functions, such as the logistic loss used in logistic regression and the hinge loss used in linear support vector machines.
K-Nearest Neighbour (KNN): KNN is a non-parametric, supervised learning classifier that makes predictions based on the proximity of data points to their neighbors.
Decision Tree: A decision tree is a supervised learning model that classifies a sample by applying a sequence of feature-based tests from the root of the tree down to a leaf, where each leaf holds the predicted class.
Naïve Bayes: The Naïve Bayes classifier uses Bayes' theorem to classify objects, assuming strong (naïve) conditional independence between the features of a data point given its class.
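Of the classifiers listed above, Naïve Bayes is the easiest to illustrate end to end on text. The following is a minimal sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. It is not the authors' implementation: the function names `train_naive_bayes` and `predict` are hypothetical, and the four training documents are made-up toy data, not data from this study.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        word_counts[y].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior under the naive independence assumption."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for y, n_docs in class_counts.items():
        total_words = sum(word_counts[y].values())
        score = math.log(n_docs / total_docs)  # log prior P(y)
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothed log likelihood P(w | y)
                score += math.log((word_counts[y][w] + 1) / (total_words + len(vocab)))
        scores[y] = score
    return max(scores, key=scores.get)


# Toy, made-up training data for illustration only.
docs = [
    ["feel", "hopeless", "tired"],
    ["depressed", "cannot", "sleep"],
    ["great", "day", "friends"],
    ["happy", "excited", "trip"],
]
labels = ["depressed", "depressed", "non-depressed", "non-depressed"]

model = train_naive_bayes(docs, labels)
print(predict(model, ["tired", "hopeless"]))  # prints: depressed
print(predict(model, ["happy", "friends"]))   # prints: non-depressed
```

In practice a library implementation would normally be used for all seven classifiers so that they can be trained and compared on the same feature matrix; the hand-written version here only makes the independence assumption and the smoothing step visible.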
IV. RESULTS
A. Output Screens
Conclusion
This paper presents a method for detecting depression from uploaded audio, uploaded text, or live voice recordings. Because the application is web-based, it can be used from a smartphone, which makes it convenient for early screening. It analyses the input and classifies the speech as either depressed or non-depressed. Used in this way, the application can serve as a resource for early detection, helping to flag warning signs such as reduced activity, mood fluctuations, appetite changes, and behavioural shifts.