Mental Health Chatbot and Mental Illness Identification System

Authors: Associate Prof. Dr Bharati .P. Vasgi, Chetan Urkudkar, Isha Ghaisas, Tanvi Karale, Padmaja Lole

DOI Link: https://doi.org/10.22214/ijraset.2022.46791

Abstract

In today's connected world, the Internet has changed the way we communicate with one another, and this has been accompanied by an increase in mental disorders. Mental illness has long been considered taboo. The main causes of mental disorders include abuse, deadlines, medications, loneliness, relationships, conflicts, genetics and major diseases. People are afraid to discuss their issues with others for fear of being judged, so they decide to go to a psychiatrist. However, psychiatrists' fees are very high, and many people cannot afford a full course of treatment. This is a situation where artificial intelligence can help. People can freely share their symptoms and problems with a chatbot and get satisfactory answers. In this paper, we propose an intelligent social system which predicts the mental disorder of a user. Users go through a few questionnaires, and a score is generated from their responses. This score indicates the stress, anxiety and depression level of the user. Further, we intend to recommend a psychiatrist or psychologist on the basis of the user’s location so that the user gets professional assistance and tools to overcome the mental illness.

I. INTRODUCTION

Machine Learning is a field that aims to build systems that improve through experience with the help of statistical and probabilistic methods.

In simple words, it is a very useful tool that helps us predict mental disorders. It also lets researchers, doctors, professors and even students extract important information from raw data.

It also helps to build personalized experiences and automated intelligent systems. Widely used algorithms in Artificial Intelligence and Machine Learning include Random Forest, Support Vector Machine, and Neural Networks.

These have proven useful for predicting and categorizing future events.

The main aim of this paper is to present a systematic literature review, an analytical review, and a summary of the ML techniques used to understand which factors are responsible for mental disorders.

In doing so, it also brings out the challenges and limitations of using ML techniques in this field. In addition, opportunities and gaps in this area for future research are discussed.

This paper therefore offers a summary and a potential research method that could guide researchers, doctors, professors and even students in learning about the methods and use of ML in mental disorder prediction.

DASS, the Depression Anxiety Stress Scales, is a self-report questionnaire of 42 items, each describing a negative emotional symptom, that should be completed in five to ten minutes.

Every question is rated on a Likert scale from 0 to 4.

II. LITERATURE SURVEY

III. METHODOLOGY

In this paper, we propose a system which identifies a person’s anxiety, stress and depression. For prediction, we take into account the person's Age, Gender, Education, Urban, Hand (left-handed or right-handed), Religion, Sexual Orientation, Race, Voted and Marriage Status. In the survey, users are asked several questions and rate each one from 0 to 4: ‘no ans’ means 0, ‘not at all’ means 1, ‘sometimes’ means 2, ‘many times’ means 3 and ‘most times’ means 4. Based on this analysis, we intend to predict anxiety.
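The rating scheme above can be turned into a single numeric score per respondent. The following is a minimal sketch, assuming that the 42 answers arrive as a list of integers and that a rating of 0 ('no ans') is skipped rather than counted. The helper name `mean_dass_score` and the handling of unanswered items are assumptions for illustration and are not taken from the paper.

```python
from statistics import mean

NUM_ITEMS = 42               # number of DASS questions stated in the paper
VALID_RATINGS = range(0, 5)  # 0 = 'no ans', 1..4 = 'not at all'..'most times'


def mean_dass_score(responses):
    """Return the mean rating over answered items (hypothetical helper).

    Assumption: a rating of 0 ('no ans') is treated as missing and skipped.
    """
    if len(responses) != NUM_ITEMS:
        raise ValueError(f"expected {NUM_ITEMS} responses, got {len(responses)}")
    if any(r not in VALID_RATINGS for r in responses):
        raise ValueError("each rating must be an integer from 0 to 4")
    answered = [r for r in responses if r != 0]
    if not answered:
        raise ValueError("no question was answered")
    return mean(answered)


# Illustrative respondent: first half 'sometimes' (2), second half 'most times' (4).
example_score = mean_dass_score([2] * 21 + [4] * 21)
```

A mean over answered items keeps respondents with a few skipped questions comparable to those who answered everything; a sum would penalize skipped items.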

The main purpose of DASS is to isolate and identify aspects of emotional disturbance, for example to evaluate the degree of severity of the main symptoms of anxiety, stress or depression. The initial aims in building the scale were to describe the full range of main symptoms of anxiety and depression, to meet strict standards of psychometric adequacy, and to achieve maximum discrimination between the anxiety and depression scales. While the DASS can be administered and scored by individuals without psychology qualifications, it is recommended that decisions and interpretation based on the outcomes be made by a skilled psychiatrist in combination with other types of assessment.

First, we imported Python libraries such as pandas, numpy, plotly, seaborn and matplotlib. The next steps are data import and data cleaning. The data was imported from a GitHub repository. The data cleaning involved the following:

In the DASS data, a respondent can give at most 3 wrong answers. At our discretion, people who gave more than 2 wrong answers were flagged and removed from the survey.
In DASS, timing matters a lot, so we removed people who took the survey too quickly or too slowly. The user must take a moderate amount of time on each question.
Creating bin groups for age is very important, because the age variable is continuous by default. Binning transforms age into ordinal data, which makes it easier to classify groups.
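The three cleaning steps above could look roughly like the sketch below. The column names (`wrong_answers`, `elapsed_seconds`, `age`), the timing bounds and the age bin edges are all hypothetical, since the paper does not state them; only the "more than 2 wrong answers" cutoff comes from the text.

```python
import pandas as pd

MAX_WRONG_ANSWERS = 2                    # from the text: more than 2 wrong answers are removed
MIN_SECONDS, MAX_SECONDS = 120, 3600     # hypothetical bounds for "too quick" / "too slow"
AGE_BINS = [0, 17, 24, 34, 49, 64, 120]  # hypothetical bin edges
AGE_LABELS = ["<18", "18-24", "25-34", "35-49", "50-64", "65+"]


def clean_survey(df):
    """Apply the three cleaning steps; column names are assumptions."""
    kept = df[df["wrong_answers"] <= MAX_WRONG_ANSWERS]
    kept = kept[kept["elapsed_seconds"].between(MIN_SECONDS, MAX_SECONDS)].copy()
    # pd.cut turns the continuous age into an ordered categorical (ordinal) variable.
    kept["age_group"] = pd.cut(kept["age"], bins=AGE_BINS, labels=AGE_LABELS)
    return kept


# Small synthetic sample, not real survey data.
sample = pd.DataFrame({
    "age": [19, 30, 45, 70],
    "wrong_answers": [0, 3, 1, 2],
    "elapsed_seconds": [300, 400, 20, 900],
})
cleaned = clean_survey(sample)
```

In this sample the second row is dropped for having 3 wrong answers and the third for finishing in 20 seconds, so two respondents remain.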

Based on the survey, here are visualizations of some parameters:

In terms of gender, over three quarters of the ~35k observations come from females, which raises some concerns about sampling bias.

Each user was asked to rate questions from 0 to 4, for example: I found myself getting upset by quite trivial things; I was aware of dryness of my mouth; I found it difficult to relax. Visualizations of all questions and their responses were generated from the ratings provided by users. Detailed visualizations can be checked here - https://bit.ly/DepressionAnxietyResults .

If we check the responses to each question in detail, the responses were very mixed, for example for "I found myself getting upset by quite trivial things". A trend appears for most questions, where younger people consistently tend to score higher than older people on the DASS survey. In terms of gender, women often score higher than men on nearly every question. Another clear and regular pattern is that a person with higher education generally scores lower on DASS than a person with lower education. Taken at face value, this indicates that more educated people are generally less depressed than others. However, this comparison does not control for age. People with higher education also tend to be older, so it may really be age that is driving the differences.

In terms of marital status, married people seem to have consistently higher DASS scores than unmarried people, but this comparison alone is not conclusive. Married people are generally older than unmarried people, and our group analysis shows a strong relation between age and DASS scores. Users who were previously married (widowed or divorced) score high on the DASS scale. They are generally older even than currently married people, so this seems to be a genuine cause of higher DASS scores.

The DASS score is most strongly correlated with TIPI4 (anxious, easily upset) and most inversely correlated with TIPI9 (calm, emotionally stable). Other highly correlated pairs of personality items are:

TIPI1 and TIPI5 (extraversion, openness to new experience)
TIPI1 and TIPI6 (extraversion, reserved and quiet)
TIPI3 and TIPI9 (self-disciplined, emotionally stable)
TIPI5 and TIPI9 (openness to new experience, emotionally stable)

As anticipated, age is more strongly correlated with education, voting, and marriage. Note that a correlation heatmap is inadequate for drawing firm conclusions about the relation between the DASS score and another variable; it works only if both the variable being compared and the dependent variable are continuous. It is especially unreliable when a variable is compared against an ordinal scale.
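The caveat about ordinal variables can be illustrated with a rank-based correlation. The sketch below is not the method used in the paper; it uses Spearman rank correlation, a common alternative for ordinal data, on a small synthetic table in which the score falls monotonically but not linearly as an ordinal education level rises. The column names and values are invented for illustration.

```python
import pandas as pd

# Synthetic, illustrative data: an ordinal education level and a score that
# falls monotonically, but not linearly, as education rises.
toy = pd.DataFrame({
    "education": [1, 2, 3, 4, 5, 6],
    "score": [3.9, 3.8, 3.6, 3.0, 2.0, 1.0],
})

# Pearson measures linear association; Spearman compares ranks only.
pearson = toy.corr(method="pearson").loc["education", "score"]
spearman = toy.corr(method="spearman").loc["education", "score"]
```

Because the ordering is perfectly monotone, the rank correlation reaches its extreme value while the linear coefficient does not, which is the kind of gap a Pearson heatmap can hide on ordinal scales.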

IV. RANDOM FOREST REGRESSOR

Our target variable, the mean score, is continuous, so we cannot use classification models and should use regression instead.

The model strongly favors continuous variables over categorical ones, which follows from the nature of the model itself. Age group, which was seen to play a major role in a person's DASS score, is overshadowed by the continuous age variable. This is acceptable, because the age variable was binned in the first place only to make visualization easier. Now that the model merely reads the data, it can use the continuous age variable to obtain a better fit. A problem with regression models is that they do not perform well when there are many categorical variables. Unless a categorical variable is very clear-cut, it does not rank high on the feature importance chart. A close look shows that the feature importance chart closely resembles the results of the correlation map.

The accuracy scores for the training set and the testing set differ widely, which suggests overfitting during training and means that the results of our correlation heatmap and feature importance should be taken with a grain of salt. The difficulty is that we estimate the average final score of a user's survey from a large number of categorical variables. Moreover, the answer to each survey question is ordinal in nature. Even people in similar situations may not answer the survey alike, since answers depend on each person's survey-answering behavior. The random forest regression done here is a proof of concept that an ML algorithm can be applied, but it is not a dependable method for solving the problem.
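The train/test gap described above can be reproduced in miniature. The sketch below uses scikit-learn's `RandomForestRegressor` on synthetic data (random ordinal-style features with a weak signal and a lot of noise), not the DASS dataset, and it assumes that the "accuracy score" corresponds to the R² value returned by `score`. All hyperparameters are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for survey data: 10 ordinal-style features in 0..4.
X = rng.integers(0, 5, size=(400, 10)).astype(float)
# Target with a weak dependence on one feature plus substantial noise.
y = 0.3 * X[:, 0] + rng.normal(0.0, 1.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# For regressors, score() returns the coefficient of determination (R^2).
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
gap = train_r2 - test_r2  # a large positive gap points to overfitting
```

Fully grown trees can memorize noise in the training set, so the training score stays high while the held-out score collapses. Limiting tree depth or raising the minimum leaf size are usual ways to narrow such a gap.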

V. RASA CHATBOT

In this section, we use Rasa NLU for entity extraction and intent classification. Rasa NLU performs natural language understanding: it takes input in the form of text and converts it into structured data. Rasa Core then carries out dialogue management: it tracks the conversation between user and bot and determines how to continue it. Artificial Intelligence and Machine Learning play a vital role here, as Rasa NLU and Rasa Core use them to learn from real-life conversations.

VI. RASA NLU AND CORE

Here is an outline of our Rasa Open Source architecture. Rasa has two main components: Natural Language Understanding (NLU) and Dialogue Management.

The NLU component handles entity extraction, response retrieval and intent classification; it is displayed as the NLU Pipeline. After this comes Dialogue Management, which determines the next action to be executed in a conversation with the user on the basis of context.

Inside the Rasa NLU Pipeline: In Rasa, the NLU pipeline is defined in the file named "config.yml". This file lists all the steps in the pipeline that Rasa uses to distinguish intents and execute suitable actions. The components used in our project are explained below.

WhitespaceTokenizer: This tokenizer uses whitespace as a separator and creates a token for every whitespace-separated character sequence.
RegexFeaturizer: During training, it creates a list of the regular expressions defined in the training data. For each regex, the RegexFeaturizer sets a feature marking whether the expression is present in the user message or not. All these features are then fed into an intent classifier or entity extractor to simplify classification (assuming the classifier has learned during the training phase what the feature indicates). Currently, only the DIETClassifier and CRFEntityExtractor components support regex features for entity extraction.
CountVectorsFeaturizer: This creates a bag-of-words representation of user messages, intents and responses, and creates features for intent classification and response selection.
HFTransformersNLP: This uses pre-trained language models from Hugging Face’s Transformers library to compute sequence-level and sentence-level representations for each example in the training data, using language-model-specific tokenization and featurization.
DIETClassifier: By default, Rasa provides the DIET (Dual Intent and Entity Transformer) classifier, which is capable of handling both entity extraction and intent classification.
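To make the first three pipeline steps concrete, the following plain-Python sketch mimics what whitespace tokenization, bag-of-words counting and regex features compute for one message. It is a conceptual illustration only and does not call Rasa or reproduce its implementation; the function names, vocabulary and patterns are hypothetical.

```python
import re
from collections import Counter


def whitespace_tokenize(text):
    """Split a message on whitespace, one token per character sequence."""
    return text.split()


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs among the tokens."""
    counts = Counter(token.lower() for token in tokens)
    return [counts[word] for word in vocabulary]


def regex_features(text, patterns):
    """Return 1 for each pattern found in the text, else 0."""
    return [1 if re.search(pattern, text) else 0 for pattern in patterns]


# Hypothetical vocabulary and patterns; in Rasa these come from training data.
VOCABULARY = ["i", "feel", "anxious", "sleep", "happy"]
PATTERNS = [r"\bsleep\w*", r"\d+"]

message = "I feel anxious and I cannot sleep"
tokens = whitespace_tokenize(message)
features = bag_of_words(tokens, VOCABULARY) + regex_features(message, PATTERNS)
```

The resulting list is a sparse feature vector of the kind a downstream intent classifier would consume: mostly small counts and zeros, one position per vocabulary word or pattern.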

Rasa DIET Classifier Architecture: Although Rasa gives developers full freedom to select any classifier they wish, the default classifier that Rasa provides is the DIET classifier. Once the user's message to the chatbot has been pre-processed into tokens, it is sent to the DIET classifier. The tokens are transformed into sparse vectors by the featurizers introduced in the Rasa NLU pipeline, and we additionally get dense numeric vectors for the input tokens only if we choose to use pre-trained word embeddings such as ConveRT, BERT or GloVe.

In our project, we have not used any pre-trained word embeddings. We pass the sparse vectors to a feed-forward layer. Rasa keeps these feed-forward layers sparse from the very beginning by dropping 80% of the connections. Technically speaking, the feed-forward layers share the same weights and are kept sparse. The DIET classifier applies this operation to all input tokens. In this way, Rasa obtains a representation of each input token. Next, to summarize the complete input sentence given by the user, Rasa uses a token called "__cls__". In the Sparse Features block, the sparse features of this token are the sum of the sparse features of all input tokens. The pre-trained Embeddings block takes the sentence embedding into account if we choose to use BERT. The output is a summary vector of the user's input sentence, which is then forwarded to an embedding layer used for prediction.
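The sparse-feature path described above can be sketched with NumPy. This is a toy illustration under stated assumptions, not Rasa's code: the vocabulary size, hidden size, random weights and the ReLU activation are all invented, and only the ideas of summing token features for the sentence token, sharing one weight matrix across tokens and dropping about 80% of connections come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE, HIDDEN_SIZE = 6, 4  # hypothetical sizes

# One sparse (one-hot style) feature row per input token.
token_features = np.array([
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
], dtype=float)

# The sentence-level token gets the sum of all token features.
cls_features = token_features.sum(axis=0)
inputs = np.vstack([token_features, cls_features])

# A single weight matrix shared by every token; a fixed random mask
# removes roughly 80% of the connections so the layer stays sparse.
weights = rng.normal(size=(VOCAB_SIZE, HIDDEN_SIZE))
mask = rng.random((VOCAB_SIZE, HIDDEN_SIZE)) >= 0.8
sparse_weights = weights * mask


def feed_forward(x):
    """Apply the shared sparse layer with a ReLU (activation is an assumption)."""
    return np.maximum(x @ sparse_weights, 0.0)


outputs = feed_forward(inputs)  # one row per token, last row is the sentence token
```

Sharing the same masked weights across all rows means every token, including the sentence token, is projected into the same hidden space, so the last row can serve as a summary of the whole message.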

Conclusion

DASS score prediction helped us get better insight into levels of anxiety and stress and factors responsible for them using machine learning and data analysis.

Copyright

Copyright © 2022 Associate Prof. Dr Bharati .P. Vasgi, Chetan Urkudkar, Isha Ghaisas, Tanvi Karale, Padmaja Lole. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET46791

Publish Date : 2022-09-16

ISSN : 2321-9653

Publisher Name : IJRASET
