Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Vishal Jain, Mahesh Parmar
DOI Link: https://doi.org/10.22214/ijraset.2022.47337
Thanks to the rapid growth of the Internet, social media sites have become vital instruments for sharing one's personal emotions with the rest of the world. Many people use text, photos, audio, and video to express their opinions or points of view. Every second, social networking sites generate a massive amount of unstructured data on the Internet. To understand human psychology, this data must be processed as soon as it is generated, which can be done via sentiment classification, a technique that detects polarity in text. It establishes whether the writer has a negative, positive, or neutral attitude towards a specific item, administration, person, or region. Sentiment analysis is insufficient in some cases, which then call for emotion detection, a finer-grained measure of a person's emotional or mental state. This review investigates sentiment analysis levels, different emotion models, and techniques for sentiment analysis and emotion detection. It also addresses the issues faced during sentiment or emotion analysis and explores a range of machine learning techniques for assessing sentiment.
I. INTRODUCTION
Emotions have a significant impact on a person's decision-making in a variety of situations. Social media platforms such as Instagram, YouTube, Twitter, and Facebook are often used in the business sector to promote products and gather customer feedback [1]. Consumers who wish to learn more about a service or product before making a purchase can benefit greatly from the feedback actively provided by the general public. Marketers can use sentiment analysis to better understand their customers and improve their goods or services accordingly. Business and customer sentiment may influence the stock market in both developed and developing countries [2]. With the growth of social media, investors can now communicate with one another more quickly and easily. As a result, investor attitudes affect investment choices, and those attitudes may spread and amplify quickly throughout the network and affect the stock market. Sentiment and emotion analysis has had a profound impact on the way business is done [3].
Emotion and sentiment analysis have a broad range of uses and may be carried out in a variety of ways. Lexicon-based methods, machine learning, and deep learning are the three main approaches, and each has advantages and disadvantages. Researchers nevertheless confront substantial hurdles, such as context dependence, sarcasm, multiple emotions in a single utterance, and lexical and syntactic ambiguity. There are no standard rules for conveying sentiment across different media: some people express their feelings forcefully, others restrain them, and yet others frame their message rationally. Developing an approach that works well across all domains therefore remains a major problem for researchers [4]. Emotion analysis is concerned with the automated extraction of emotions expressed in user-supplied text. According to Ekman [5], the primary human emotions are anger, disgust, fear, surprise, sadness, and joy.
The remainder of the paper is organised as follows. Section II provides a quick overview of emotion analysis, and Section III presents sentiment analysis. Section IV discusses the challenges in sentiment and emotion analysis, and Section V describes the multi-task ensemble framework. Section VI covers machine learning techniques and Section VII covers deep learning techniques. Section VIII reviews related studies in emotion and sentiment analysis that use these strategies, after which the paper closes with concluding remarks.
II. EMOTION ANALYSIS
Emotion analysis is essential to affective computing. "Affect" refers to emotion, and "computing" refers to measuring or quantifying it. To understand human-machine interaction, we need devices or systems that can identify, process, interpret, and replicate human emotions. Text, speech, and facial expressions are examples of the data such systems work with. By studying the comments or feedback that people post, we can evaluate the well-being of a community, help prevent suicides, and allow enterprises to gauge how satisfied their customers are. Text collected from e-learning environments can likewise be used for opinion mining, as can text gathered by corporate organisations. This wide variety of applications, including social assistance, assessing the well-being of a community, and the identification and treatment of suicidal tendencies, is why researchers find the detection of emotions from text such a fascinating topic. Analysis may be performed at several levels, including the document, sentence, word, and aspect levels. Figure 2 shows the procedures involved in analysing input data for emotional content [6].
A. Pre-processing
People like to express their sentiments and emotions informally on social media. Consequently, the data gathered from posts and reviews on these platforms is largely unstructured, which makes sentiment and emotion analysis by machines difficult. Pre-processing is therefore an essential data cleaning step, since the procedures that follow it are strongly affected by data quality [7].
B. Feature Extraction
A computer interprets text numerically. The technique of mapping words or phrases to real-valued vectors is known as word vectorization or word embedding. The feature matrix is created by breaking a document down into sentences, which are then broken further into words. Each row of the feature matrix corresponds to a sentence or document, each word in the dictionary is represented by its own feature column, and each cell contains the number of times that word occurs in that sentence or document [8].
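The count-based feature matrix described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming whitespace tokenisation and lowercase folding; the function names `build_vocabulary` and `bag_of_words` are hypothetical and do not come from the cited work.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Return the sorted list of distinct lowercase tokens across all sentences."""
    return sorted({token for s in sentences for token in s.lower().split()})


def bag_of_words(sentences):
    """Build a count matrix: one row per sentence, one column per vocabulary word."""
    vocab = build_vocabulary(sentences)
    matrix = []
    for s in sentences:
        counts = Counter(s.lower().split())  # missing words count as 0
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


# Toy input (assumed for illustration).
vocab, matrix = bag_of_words(["good movie", "bad movie bad plot"])
print(vocab)   # column labels
print(matrix)  # one row of counts per sentence
```

Each column corresponds to one dictionary word and each cell holds that word's count in the row's sentence, which matches the description in the text. Real pipelines usually add tokenisation rules, stop-word handling, and weighting schemes on top of raw counts.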
C. Classification (Machine and deep learning)
Classification mainly relies on three families of algorithms: lexicon-based, machine learning-based, and deep learning-based [4].
III. SENTIMENT ANALYSIS
A wide variety of applications make use of sentiment analysis for purposes including recommendation as well as feedback analysis. Sentiment analysis, sometimes known as SA, is a subfield in text mining that is actively being researched. SA refers to a computational approach to the handling of views, feelings, and the subjectivity of language. Many disciplines of study, including psychology and neuroscience, use sentiment analysis since it is a key component of human behaviour. Many computer scientists are interested in this because of its broad variety of applications, including social support, community evaluation, and even the avoidance of suicidal ideation. As a result, models may be used to evaluate social media data and get insights from the general public about a product or a subject. Various suicide prevention and e-learning systems may also benefit from sentiment analysis. We chose to do a review of existing methods for detecting emotions in text and make it accessible to the scientific community since we were excited about its great potential.
An example of how sentiment analysis works is shown in Figure 3. Once the machine learning model has been trained by utilising relevant data, it may be used for sentiment categorization. There are other dictionary-based methods that don't need the model to be trained.
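To illustrate the dictionary-based route mentioned above, the following sketch scores a text by summing per-word polarity values and needs no training step. The lexicon `LEXICON` and its scores are invented for this example; real sentiment lexicons are far larger and handle negation and intensity, which this sketch does not.

```python
# Hypothetical toy lexicon (assumption): word -> polarity score.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}


def lexicon_polarity(text, lexicon=LEXICON):
    """Classify text as positive, negative, or neutral from summed word scores."""
    tokens = text.lower().split()
    # Strip trailing/leading punctuation so "phone!" matches "phone".
    score = sum(lexicon.get(t.strip(".,!?"), 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_polarity("I love this great phone!"))
print(lexicon_polarity("Terrible battery, bad screen."))
print(lexicon_polarity("It arrived today."))
```

Because the decision depends only on the dictionary, the quality of the lexicon bounds the quality of the result, which is one reason domain-specific lexicons are hard to reuse elsewhere.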
A. Sentiment Analysis Levels
Sentiment analysis has been studied at a number of levels of granularity, such as the document, sentence, and aspect levels.
IV. CHALLENGES IN SENTIMENT ANALYSIS AND EMOTION ANALYSIS
Because of the prevalence of the Internet, people nowadays produce a great deal of data in the form of unstructured text. Poor syntax, misspelt words, and emerging slang are only some of the problems found in text from social networking sites, as seen in Figure 5. These obstacles make it difficult for machines to analyse sentiment and emotion. People also sometimes have trouble articulating their feelings in a straightforward manner. For example, in the question "y have u been soooo late?", the word "why" is written as "y," the word "you" is written as "u," and the word "so" is stretched to "soooo" to emphasise the tone of the question. In addition, it is unclear from this statement whether the writer is upset or anxious. Detecting sentiment and emotion from real-world data is therefore fraught with difficulties for a number of different reasons [12].
The limited availability of resources is another obstacle in emotion detection and sentiment analysis. Some statistical techniques, for instance, call for a very large annotated dataset. Collecting the data is not very challenging, but manually labelling a massive dataset is highly time-consuming and less reliable [13]. A further resource issue is that most resources are available only in English. As a result, sentiment analysis and emotion recognition in languages other than English, particularly regional languages, is a considerable challenge as well as a potentially rewarding opportunity for researchers. In addition, some lexicons and corpora apply only to a single domain, which makes them difficult to reuse in other areas.
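A small normalisation step can address the kind of noise in the example above (single-letter slang and stretched words). The sketch below is illustrative only: the `SLANG` table is a hypothetical two-entry dictionary, and collapsing every run of three or more identical letters to one letter is a crude assumption that will damage some words (for example, "cooool" becomes "col").

```python
import re

# Hypothetical slang table (assumption); a real one would be much larger.
SLANG = {"y": "why", "u": "you"}


def normalize(text):
    """Lowercase, collapse stretched letters, and expand known slang tokens."""
    text = text.lower()
    # Collapse any letter repeated three or more times to a single letter.
    text = re.sub(r"([a-z])\1{2,}", r"\1", text)
    # Replace each alphabetic token with its expansion, leaving punctuation in place.
    return re.sub(r"[a-z']+", lambda m: SLANG.get(m.group(0), m.group(0)), text)


print(normalize("y have u been soooo late?"))
```

Note that normalisation removes the emphasis carried by the stretched word, so a system interested in emotional intensity may want to record that signal before discarding it.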
V. MULTI-TASK ENSEMBLE FRAMEWORK
Ensemble learning algorithms integrate the predictions of many different base learners. Over the last decade, many researchers in the machine learning literature have focused on constructing good ensembles, since the generalisation ability of an ensemble is substantially better than that of a single learner. Building an ensemble classifier typically involves two steps: creating multiple classifiers and combining their predictions. As a general rule, the component classifiers must be both accurate and diverse in order to form an effective ensemble [14]. A multi-task ensemble framework learns several related tasks simultaneously and puts the representations learnt by multiple models to use in the ensemble model.
The multi-task learning framework aims to achieve generalisation by exploiting the relatedness of multiple problems and tasks. When two or more tasks are related, it is hypothesised that the joint model can benefit from shared representations when learning in a multi-task setting. Compared with a single-task framework, the multi-task framework has three major advantages: (1) it improves generalisation; (2) it improves task performance via shared representations; and (3) it decreases model complexity in terms of learnable parameters by employing a single unified model rather than distinct models for each task [15].
In multi-task learning (MTL), machine learning models are trained concurrently on data from many tasks, using shared representations to capture the concepts common to a group of related tasks. It is hoped that these shared representations will help relieve the well-known drawbacks of deep learning: large-scale data demands and computational cost. Obtaining such results has been difficult and is still under investigation today. Machine learning applications are often restricted to the training data (input features and class attributes), so no further information about other relevant tasks is provided. Using the MTL paradigm to increase generalisation capability therefore requires a thorough understanding of how to generate related tasks from the provided data. Some features that the feature selection method discards might be useful as additional related tasks for the transfer of inductive bias [16].
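The third advantage, fewer learnable parameters, can be made concrete with a back-of-the-envelope count. The sketch below assumes a deliberately simple architecture that is not taken from [15]: one fully connected shared layer followed by one fully connected output head per task. The layer sizes (300 inputs, 64 hidden units, a 3-class and a 6-class task) are hypothetical.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def multitask_params(n_in, n_hidden, task_outputs):
    """One shared hidden layer, then a separate output head per task."""
    shared = linear_params(n_in, n_hidden)
    heads = sum(linear_params(n_hidden, k) for k in task_outputs)
    return shared + heads


def single_task_params(n_in, n_hidden, task_outputs):
    """A separate hidden layer and output head for every task."""
    return sum(
        linear_params(n_in, n_hidden) + linear_params(n_hidden, k)
        for k in task_outputs
    )


# Hypothetical sizes: 300-dim input, 64 hidden units,
# a 3-class sentiment task and a 6-class emotion task.
tasks = [3, 6]
print(multitask_params(300, 64, tasks))    # shared layer counted once
print(single_task_params(300, 64, tasks))  # shared layer duplicated per task
```

Under these assumptions the unified model needs roughly half the parameters of two separate models, because the large input-to-hidden layer is counted once. The saving grows with the number of tasks that share the same layer.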
VI. MACHINE LEARNING TECHNIQUES
Artificial intelligence (AI) includes a subfield known as machine learning (ML). We can teach programmes to learn from experience in the same manner that people do by using a technique called machine learning. These programmes learn from their experiences, expand, and adapt as more data is input into them. This is accomplished by using algorithms that, via an iterative process, learn from data. Applications may react to different types of input by using pattern recognition. The capacity of an application to respond to new input by leveraging repetitions of previous responses is an example of machine learning. Training data are the instances of connections between input data and outputs that are used by machine learning algorithms to teach them how to anticipate outputs based on past examples of such relationships. It is possible to progressively develop a model of the connection between inputs and outputs by evaluating its predictions and making adjustments when they are shown to be incorrect. The process of identifying patterns in data via the use of computer programmes is known as machine learning. It is a method for producing something that is analogous to the line of best fit. When dealing with data that is both complicated and abundant in characteristics, it is beneficial to automate this procedure [17].
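The iterative idea described above, predicting, measuring the error, and adjusting, can be illustrated with the line-of-best-fit case the text mentions. The following is a minimal sketch using plain gradient descent on mean squared error; the learning rate, epoch count, and toy data are assumptions chosen for the example.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Adjust the parameters against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (assumed for illustration).
w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(w, 3), round(b, 3))
```

Each pass evaluates the current predictions against the training examples and nudges the parameters to reduce the error, which is the same loop that larger models run over many more parameters.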
A. Machine Learning-based Approaches
Machine learning methods are often used to classify sentiment. By definition, machine learning (ML) approaches rely almost entirely on statistical analysis.
VII. DEEP LEARNING
Deep learning is a new frontier in machine learning and pattern recognition. Deep architectures are combined with supervised or unsupervised learning methods to perform classification. Neural networks can disentangle the various explanatory factors in data and discover more abstract features at higher levels of the representation. Deep learning has recently received great attention because of its state-of-the-art performance in many areas, including object perception, speech recognition, computer vision, collaborative filtering, and natural language processing (NLP). It is inspired by the idea that the human brain holds several levels of representation, from lower-level features to higher-level abstractions. In our minds, concepts are arranged hierarchically: we learn elementary ideas first and then build on them to form more abstract ones. The brain contains numerous layers of neurons that act as feature detectors, which become more abstract as the levels increase. Machines can generalise more easily from this more abstract style of representation [18].
A. Deep Learning Techniques
A wide range of learning problems, including those stated above, may be successfully tackled using deep learning models. Deep learning algorithms extract task-specific features automatically [19].
VIII. LITERATURE REVIEW
Many researchers have focused on these subjects and have produced substantial results. The studies summarised below are important in their respective areas, and reviewing them together gives a quick overview of the field.
(Akhtar et al., 2022) develop a multi-task ensemble learning system that can solve multiple related problems simultaneously. The ensemble model makes its predictions by combining the learned representations of three deep learning models (namely CNN, LSTM, and GRU) with a hand-crafted feature representation. Within the multi-task framework, the problems addressed include "emotion categorization & intensity", "valence, arousal & dominance for emotion", and "valence, arousal as well as dominance for sentiment". The underlying problems span two granularities, namely coarse-grained and fine-grained, and a wide variety of domains (i.e., tweets, Facebook posts, news headlines, blogs, letters etc.). The experimental findings suggest that the proposed multi-task framework outperforms the single-task frameworks in every experiment [15].
(Aslam et al., 2022) use cryptocurrency-related tweets for sentiment analysis and emotion recognition, both of which are widely used for forecasting cryptocurrency market prices. To improve the effectiveness of the analysis, an LSTM-GRU deep learning ensemble model was constructed. The LSTM and GRU models are stacked, with the GRU building on the features produced by the LSTM. The findings indicate that the machine learning models perform considerably better when BoW features are used. The proposed LSTM-GRU ensemble outperforms both the machine learning models and state-of-the-art models, with an accuracy of 0.99 for sentiment analysis and 0.92 for emotion prediction [20].
(Polonijo et al., 2021) present a deep learning method that combines sentiment ratings with Word2Vec vectors. The result is a sentiment-aware representation that carries both emotional and semantic information. Combining these two kinds of information yields a more accurate propaganda classification model. Integrating the word embedding vectors with the sentiment analysis results allows the method to keep the flexibility of the Word2Vec vectors. Tests were conducted using a Word2Vec network without sentiment information as well as sentiment data combined with typical deep learning approaches for propaganda identification. The results showed that the hybrid strategy improved propaganda classification accuracy [21].
(Mohana et al., 2021) use a Kaggle dataset that had previously been crawled and labelled as either positive or negative. The data, which may include emoticons, usernames, and hashtags, must be processed and converted into a standard format before it can be used. The study also extracts relevant features from the text, such as unigrams and bigrams, which are two ways of representing a tweet. The researchers combine a large number of different classifiers through a meta-learning approach known as ensembling in order to improve prediction accuracy. The findings indicate that deep learning techniques are superior to the other methods considered [22].
(Aziz & Dimililer, 2020) present a framework for sentiment analysis using an ensemble of classifiers. The proposed weighted majority voting ensemble forms a single classifier by combining six different models: Naive Bayes, Logistic Regression, Stochastic Gradient Descent, Random Forest, Decision Tree, and Support Vector Machine. The accuracy or F1-score of each individual classifier is used to calculate the weight assigned to that classifier in the ensemble. For comparison, the system also implements an ensemble based on simple majority voting, and the six individual classifiers are compared with one another to see how well they perform. The proposed ensemble model is tested on several existing sentiment datasets, including those from SemEval 2017 Tasks 4A, 4B, and 4C. According to the findings, the Logistic Regression classifier is the best of the individual classifiers because it takes more data into account than the others. Furthermore, the proposed weighted majority voting ensemble of the six classifiers outperforms both simple majority voting and all individual classifiers [23].
(Rathi et al., 2018) report that existing machine learning algorithms were unable to provide superior results in sentiment classification. To improve classification results in the sentiment analysis area, the authors apply ensemble machine learning methods, which increase the efficiency and reliability of the proposed approach. Specifically, they combine a Support Vector Machine (SVM) with a Decision Tree (DT) to obtain better classification performance in terms of F-measure and accuracy [24].
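One simple way to build a sentiment-aware word representation is to append a sentiment score to each word vector. The sketch below shows that idea only; the source summary does not state how [21] actually combines the two signals, so concatenation is an assumption here, and the tiny embedding table, sentiment table, and function name are all hypothetical.

```python
def sentiment_aware_vectors(tokens, embeddings, sentiment, dim):
    """Append a per-word sentiment score to each word's embedding vector.

    Unknown words get a zero vector and a neutral score of 0.0.
    """
    vectors = []
    for token in tokens:
        base = embeddings.get(token, [0.0] * dim)
        vectors.append(list(base) + [sentiment.get(token, 0.0)])
    return vectors


# Hypothetical 2-dimensional embeddings and sentiment scores.
embeddings = {"glorious": [0.2, 0.7], "leader": [0.5, 0.1]}
sentiment = {"glorious": 0.9}

vectors = sentiment_aware_vectors(
    ["glorious", "leader", "speaks"], embeddings, sentiment, dim=2
)
print(vectors)
```

Because the original embedding dimensions are left untouched, the semantic part of the vector can still be used exactly as before, and a downstream classifier is free to learn how much weight to give the extra sentiment dimension.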
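Unigram and bigram features of the kind mentioned above can be extracted with a short helper. This is a generic sketch with whitespace tokenisation assumed; it is not the feature extraction code of the cited study, and the sample text is invented.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "not a good day".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
print(unigrams)
print(bigrams)
```

Bigrams keep some local word order that unigrams lose, which can matter for short texts where neighbouring words modify one another.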
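The difference between weighted and simple majority voting can be shown with a small sketch. It is a generic illustration, not the implementation from [23]: the classifier names, their predicted labels, and the weights (standing in for each classifier's accuracy or F1-score) are all hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight among the voters.

    On an exact tie, the label that was seen first wins.
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)


def simple_majority_vote(predictions):
    """Majority voting is the special case where every voter has weight 1."""
    return weighted_majority_vote(predictions, {name: 1.0 for name in predictions})


# Hypothetical predictions for one tweet and hypothetical per-classifier weights.
predictions = {
    "nb": "negative", "dt": "negative", "rf": "negative", "sgd": "negative",
    "lr": "positive", "svm": "positive",
}
weights = {"nb": 0.45, "dt": 0.40, "rf": 0.42, "sgd": 0.44, "lr": 0.90, "svm": 0.88}

print(simple_majority_vote(predictions))
print(weighted_majority_vote(predictions, weights))
```

In this made-up case four weak classifiers outvote two strong ones under simple voting, while weighting by each classifier's score lets the stronger classifiers decide the outcome.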
Table 1: Performance of various existing models

| No. | Author (year) | Method | Dataset | Accuracy |
|-----|---------------|--------|---------|----------|
| 1 | Md Shad Akhtar (2021) [15] | CNN, LSTM and GRU | WASSA2017, SemEval-2016 | 89% |
| 2 | Krishna Kumar Mohbey (2021) [25] | LSTM | Amazon Review Dataset | 93.66% |
| 3 | Rifqi Majid (2021) [26] | RNN | Cornell movies | 0.76 (reported as precision) |
| 4 | Fengdong Sun (2020) [27] | LSTM | Hotel Reviews Dataset | 95% |
| 5 | Qi Wang (2019) [28] | RNN, LSTM and GRU | Large Movie Review Dataset | 89% |
| 6 | Benwang Sun (2018) [29] | SVM, CNN and CNN-LSTM | Tibetan micro-blog dataset | 74%, 71%, 86% |
IX. CONCLUSION
Thanks to the advancement of the Internet, consumers can now submit feedback in the form of reviews, ratings, and comments on commercial and social websites. Within the field of text processing, the study of sentiment and emotion is an active topic of research, and a central aim of this line of work is to classify reviews automatically. The goal of this paper is to provide an overview of the many methodologies now available for recognising emotions and sentiments. It presents the findings of a survey of the methodologies and resources used to analyse sentiment and emotion, and goes over the various approaches, categories, and challenges involved. At the same time, it is important to remember that traditional methods such as lexicon-based approaches, machine learning algorithms, and deep learning approaches are all still being improved. Furthermore, pre-processing and feature extraction procedures can have a considerable impact on the performance of many sentiment and emotion analysis approaches. We also placed a high priority on reviewing and summarising important material. The findings of this survey can be used to gain a better understanding of the difficulties that lie ahead and to define the direction that future sentiment and emotion analysis research should take.
[1] I. E. Agbehadji and A. Ijabadeniyi, “Approach to Sentiment Analysis and Business Communication on Social Media,” 2021.
[2] H. J. Jang, J. Sim, Y. Lee, and O. Kwon, “Deep sentiment analysis: Mining the causality between personality-value-attitude for analyzing business ads in social media,” Expert Syst. Appl., 2013, doi: 10.1016/j.eswa.2013.06.069.
[3] A. Bhardwaj, Y. Narayan, Vanraj, Pawan, and M. Dutta, “Sentiment Analysis for Indian Stock Market Prediction Using Sensex and Nifty,” 2015, doi: 10.1016/j.procs.2015.10.043.
[4] P. Nandwani and R. Verma, “A review on sentiment analysis and emotion detection from text,” Social Network Analysis and Mining, 2021, doi: 10.1007/s13278-021-00776-6.
[5] P. Ekman, “An Argument for Basic Emotions,” Cogn. Emot., 1992, doi: 10.1080/02699939208411068.
[6] N. M. Hakak, M. Mohd, M. Kirmani, and M. Mohd, “Emotion analysis: A survey,” 2017, doi: 10.1109/COMPTELIX.2017.8004002.
[7] A. Abdi, S. M. Shamsuddin, S. Hasan, and J. Piran, “Deep learning-based sentiment classification of evaluative text based on Multi-feature fusion,” Inf. Process. Manag., 2019, doi: 10.1016/j.ipm.2019.02.018.
[8] A. Bandhakavi, N. Wiratunga, D. Padmanabhan, and S. Massie, “Lexicon based feature extraction for emotion text classification,” Pattern Recognit. Lett., 2017, doi: 10.1016/j.patrec.2016.12.009.
[9] J. Zhu, C. Xu, and H. S. Wang, “Sentiment classification using the theory of ANNs,” J. China Univ. Posts Telecommun., 2010, doi: 10.1016/S1005-8885(09)60606-3.
[10] A. Gupta and J. Pruthi, “Survey on Sentiment Analysis for Twitter,” vol. 8, no. 03, pp. 51–60, 2017.
[11] M. Wankhade, A. C. S. Rao, and C. Kulkarni, “A survey on sentiment analysis methods, applications, and challenges,” Artif. Intell. Rev., 2022, doi: 10.1007/s10462-022-10144-1.
[12] E. Batbaatar, M. Li, and K. H. Ryu, “Semantic-Emotion Neural Network for Emotion Recognition from Text,” IEEE Access, 2019, doi: 10.1109/ACCESS.2019.2934529.
[13] A. Balahur and M. Turchi, “Comparative experiments using supervised learning and machine translation for multilingual sentiment analysis,” Comput. Speech Lang., 2014, doi: 10.1016/j.csl.2013.03.004.
[14] Q. Wang and L. Zhang, “Ensemble learning based on multi-task class labels,” 2010, doi: 10.1007/978-3-642-13672-6_44.
[15] M. S. Akhtar, D. Ghosal, A. Ekbal, P. Bhattacharyya, and S. Kurohashi, “All-in-One: Emotion, Sentiment and Intensity Prediction Using a Multi-Task Ensemble Framework,” IEEE Trans. Affect. Comput., 2022, doi: 10.1109/TAFFC.2019.2926724.
[16] S. Vandenhende, S. Georgoulis, W. Van Gansbeke, M. Proesmans, D. Dai, and L. Van Gool, “Multi-Task Learning for Dense Prediction Tasks: A Survey,” IEEE Trans. Pattern Anal. Mach. Intell., 2021, doi: 10.1109/TPAMI.2021.3054719.
[17] V. Jain, A. Kulkarni, Y. B. T. Integrated, C. Engineering, and N. Mpstme, “Survey on Various Algorithms of Machine Learning and its Applications,” 2020.
[18] V. Pream Sudha and R. Kowsalya, “A Survey on Deep Learning Techniques, Applications and Challenges,” Int. J. Adv. Res. Sci. Eng. IJARSE, vol. 8354, no. 4, p. 3, 2015, [Online]. Available: http://www.ijarse.com.
[19] Y. K. Bhatti, A. Jamil, N. Nida, M. H. Yousaf, S. Viriri, and S. A. Velastin, “Facial Expression Recognition of Instructor Using Deep Features and Extreme Learning Machine,” Comput. Intell. Neurosci., 2021, doi: 10.1155/2021/5570870.
[20] N. Aslam, F. Rustam, E. Lee, P. B. Washington, and I. Ashraf, “Sentiment Analysis and Emotion Detection on Cryptocurrency Related Tweets Using Ensemble LSTM-GRU Model,” IEEE Access, vol. 10, pp. 39313–39324, 2022, doi: 10.1109/ACCESS.2022.3165621.
[21] B. Polonijo, S. Suman, and I. Simac, “Propaganda Detection Using Sentiment Aware Ensemble Deep Learning,” 2021, doi: 10.23919/MIPRO52101.2021.9596654.
[22] R. S. Mohana, S. Kalaiselvi, K. Kousalya, P. Mohamed Hanif, D. Lohappriya, and K. Khalid Ali Khan, “Twitter based sentiment analysis to predict public emotions using machine learning algorithms,” 2021, doi: 10.1109/ICIRCA51532.2021.9544817.
[23] R. H. H. Aziz and N. Dimililer, “Twitter Sentiment Analysis using an Ensemble Weighted Majority Vote Classifier,” 2020, doi: 10.1109/ICOASE51841.2020.9436590.
[24] M. Rathi, A. Malik, D. Varshney, R. Sharma, and S. Mendiratta, “Sentiment Analysis of Tweets Using Machine Learning Approach,” 2018, doi: 10.1109/IC3.2018.8530517.
[25] K. K. Mohbey, “Sentiment analysis for product rating using a deep learning approach,” 2021, doi: 10.1109/ICAIS50930.2021.9395802.
[26] R. Majid and H. A. Santoso, “Conversations Sentiment and Intent Categorization Using Context RNN for Emotion Recognition,” 2021, doi: 10.1109/ICACCS51430.2021.9441740.
[27] F. Sun, N. Chu, and X. Du, “Sentiment analysis of hotel reviews based on deep leaning,” 2020, doi: 10.1109/ICRIS52159.2020.00158.
[28] Q. Wang, L. Sun, and Z. Chen, “Sentiment analysis of reviews based on deep learning model,” 2019, doi: 10.1109/ICIS46139.2019.8940267.
[29] B. Sun, F. Tian, and L. Liang, “Tibetan Micro-Blog Sentiment Analysis Based on Mixed Deep Learning,” 2018, doi: 10.1109/ICALIP.2018.8455328.
Copyright © 2022 Vishal Jain, Mahesh Parmar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET47337
Publish Date : 2022-11-06
ISSN : 2321-9653
Publisher Name : IJRASET