Twitter's popularity produces a massive volume of data, which makes it a common source of big data problems. One such problem is the classification of tweets: their sophisticated and complex language makes current tools insufficient. We present HTwitt, a framework built on top of the Hadoop ecosystem that combines a MapReduce algorithm with a set of machine learning techniques embedded within a big data analytics platform to address these problems efficiently.
I. INTRODUCTION
Nowadays, people all around the world use social media sites to share information. Online social media such as Twitter, Facebook, and Instagram allow users to communicate with the whole world: they write their opinions about products, share their moments, and even influence politics and companies. Twitter, for example, is a platform on which users send and read posts known as ‘tweets’ and interact with different communities. Users share their daily lives and post their opinions on everything from brands to places.
Companies can benefit from this massive platform by collecting data on the opinions expressed about them; almost every large company has a Twitter account to follow customers' feedback on its services or products. Sentiment analysis, also known as opinion mining, is used to classify specific words by the sentiment they carry. The aim of this system is to present a model that can perform sentiment analysis of real data collected from Twitter. Data on Twitter is highly unstructured, which makes it difficult to analyse.
II. LITERATURE SURVEY
III. RELATED WORK
Sentiment analysis is a technique widely used in text mining. Twitter sentiment analysis therefore means using advanced text mining techniques to classify the sentiment of a text (here, a tweet) as positive, negative, or neutral.
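The three-way labelling described above can be illustrated with a minimal sketch. It uses a simple word-lexicon score rather than the advanced text mining or machine learning techniques discussed in this paper, and the word lists and the function names `tokenize` and `classify` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical toy lexicons (assumption): a real system would use a curated
# sentiment resource or a trained classifier instead of these short lists.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}


def tokenize(tweet: str) -> list:
    """Lowercase the tweet, drop URLs and @mentions, keep alphabetic tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z']+", cleaned)


def classify(tweet: str) -> str:
    """Label a tweet positive, negative, or neutral from lexicon hits."""
    tokens = tokenize(tweet)
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["I love this phone, great camera!",
                 "Terrible service @brand http://t.co/x",
                 "Flight lands at 5pm"]:
        print(classify(text), "<-", text)
```

A lexicon score of this kind ignores negation and sarcasm, which is one reason learned classifiers are usually preferred for tweets.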
The framework will accept both structured and unstructured data. With machine learning, this method can improve the accuracy of prediction. The limitations faced are the lack of information and research about the symptoms of disease.
Exporting the dataset into the desired format is one more setback we have to face. The user's input query must also be validated before any processing begins.
IV. ARCHITECTURAL DIAGRAM
V. CONCLUSION
“TWITTER DATA VISUALIZATION” is presented, implemented using simple programming concepts of Python and JavaScript. Twitter Analyzer is capable of finding the top ten trending hashtags and users at any given point in time and plotting them against their frequency in a bar graph. The model explained here can be extended to improve user experience, provide additional functionality, and optimize processing power. Machine learning techniques are simpler and more efficient than symbolic techniques, and they can be applied to Twitter sentiment analysis. The classification accuracy of the feature vector is tested using different classifiers such as Naive Bayes, SVM, Maximum Entropy, and ensemble classifiers. All these classifiers have almost similar accuracy for the new feature vector.
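As an illustration of the trending-hashtag step, the following is a minimal sketch that assumes the tweets are already available as plain strings. The function name `top_hashtags` and the sample tweets are hypothetical and are not taken from the Twitter Analyzer implementation.

```python
import re
from collections import Counter

# A hashtag is taken to be '#' followed by word characters (assumption).
HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags as (tag, count) pairs."""
    counts = Counter()
    for text in tweets:
        # Case-fold so that #AI and #ai are counted as the same tag.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["#AI is fun #ai", "Learning #Python", "#python and #AI"]
    for tag, freq in top_hashtags(sample):
        # A plain-text stand-in for the bar graph of tag frequency.
        print(f"{tag:<10} {'#' * freq} ({freq})")
```

The resulting (tag, count) pairs can be passed to any plotting library to draw the bar graph; the same counting approach applies to user names by changing the pattern.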