Corroboration of Twitter Sentiment Analysis and Event Analysis of Indian Budget 2022 on the Bitcoin Market

Authors: Abhinand G, Dr. V. Uma Maheswari

DOI Link: https://doi.org/10.22214/ijraset.2022.43533

Abstract

This study examines whether the results of sentiment analysis corroborate those of event analysis for the cryptocurrency announcement in the Indian Budget 2022. Sentiment analysis classifies the tweets on Bitcoin posted on the Budget Day and the subsequent 2 days as Positive or Negative in order to decipher the overall sentiment of Bitcoin investors. This has been made possible by using a supervised machine learning algorithm. Event analysis identifies whether any abnormal returns are seen on the event day and the subsequent 10 days. This study observes that the range between the positive and negative sentiments is minimal and that no abnormal returns are found after the budget announcement in the event study. It may be concluded that the positive sentiments nullified the negative sentiments, and that the overall sentiments have a mean subjectivity score leaning towards the less-opinionated side, which may account for the lack of abnormality found on the event day as well as in the adjustment period.

I. INTRODUCTION

Bitcoin has the highest market capitalisation in the cryptocurrency market. Bitcoin is widely considered as a method of payment across countries and also by blue chip companies [1]. Bitcoin is the oldest and most active cryptocurrency, and other cryptocurrencies are viewed as financial assets rather than as money like Bitcoin [2]. Cryptocurrency in India is unregulated; 7.3% of the Indian population own cryptocurrency (see footnote 1) and the number of investors is expected to grow even faster (see footnote 2). The latest development in the Indian cryptocurrency market is the announcement by the Indian Government of a 30% fixed tax rate on all income generated through crypto trading, made during the Union Budget 2022 on 1st February 2022. "The budget provides clarity on taxation and shows the government’s intent to take a business-friendly approach while protecting the interest of consumers and the exchequer. We hope to work with the government to help bring crypto-asset taxation at par with other asset classes and participate in the central government’s vision to promote economic growth," tweeted Mr. Ashish Singhal of Coin Switch Kuber (see footnote 3). This paper attempts to analyse the sentiments of tweets that contain the keywords "Indian Budget 2022" and "Bitcoin" in order to establish a corroboration between the polarity of the sentiments and the presence of abnormal returns in the Bitcoin market (BTC-INR). To perform the sentiment analysis, a tweet dataset has been prepared using a scraper; after further pre-processing and the application of machine learning models pre-trained by us, the polarities and subjectivities of the tweets have been observed. The paper also aims to study the effect of the range between the sentiments on the BTC-INR market by examining the price movements in BTC-INR through the constant mean return model of event study. Moreover, establishing a relationship between these two entities could give us an insight into the effect of similar events on Bitcoin markets in future scenarios with similar implications.

II. RESEARCH QUESTIONS

The study aims to answer the following research questions:

RQ1: Is there any statistically significant abnormality in the BTC-INR market during the time of the Indian Budget 2022 announcement?
RQ2: How do the positive and negative sentiments pertaining to the Indian Budget 2022 and Bitcoin affect the BTC-INR market?

Footnote 1: https://triple-a.io/crypto-ownership/
Footnote 2: https://www.livemint.com/news/india/indians-are-spending-millions-daily-on-cryptocurrency-trading-11606906191740.html
Footnote 3: https://economictimes.indiatimes.com/markets/cryptocurrekency/articleshow/89278494.cms

The following objectives were set in order to answer the research questions:

To conduct an event study on BTC-INR to estimate the abnormal returns using the Constant Mean Return Model.
To classify the sentiments of the tweets on Bitcoin pertaining to the Indian Budget 2022 using pre-trained models.
To compare the results of the sentiment analysis with the results of the event analysis.

We present our paper in the following order: Section III reviews the literature pertaining to event study and sentiment analysis. Sections IV and V discuss in detail the event study data set and the analysis. Sections VI to X give a detailed sentiment analysis of the Bitcoin tweets on the cryptocurrency-related announcements in the Indian Budget 2022. Section XI discusses the findings of the study and Section XII deliberates on the scope for future research.

III. RELATED WORK

Makrehchi et al. [3] proposed a model that estimates sentiment from Twitter posts and uses that sentiment to predict future stock market movements. Sentiments of Twitter posts have a significant relationship with stock returns and volatility [4]. Ranco et al. [5] found that the sentiment polarity of Twitter posts impacts the cumulative abnormal return. Cryptocurrency liquidity increases or decreases after positive and negative news announcements [6]. An event study of the effect of news of cryptocurrency thefts on cryptocurrency prices reveals that the price rises when there is news of a theft [7]. Bitcoin investors behave rationally for positive events and a bit more irrationally for negative events [1].

In the short term, the strength of a trend and, hence, the price in both bullish and bearish markets is invariant to volume changes; however, the volume is sensitive to price changes, especially for the upward trend. The detected relationship is unidirectional, meaning that positive information from the cryptocurrency market encourages investors to make a stronger entrance into the market, which causes bubbles and further drives price increases. Investors buy more when there is positive information from the cryptocurrency market, which drives prices up [2]. Elon Musk's tweets on cryptocurrency have a significant positive impact on the abnormal returns and trading volume of Dogecoin but not of Bitcoin [8].

Agarwal et al. [9] made use of three types of models, namely the unigram model, a feature-based model and a tree kernel model, in order to classify tweets as positive, negative or neutral based on their sentiments. They identified that the sentiment analysis process used for Twitter data does not differ from that used for other genres. Their feature analysis also revealed that the important features combine the polarity of words with their part-of-speech tags. Lin et al. [10] analysed the sentiments of retweeting comments and depicted the turning point in sentiment when a tweet gets retweeted. They used two primary approaches for classification: the Support Vector Machine (SVM) and a lexicon-based method. The latter gave a higher precision and recall, which led to the conclusion that the said method is more effective in the classification of sentiments.

Prabowo et al. [11] studied the application of sentiment analysis to unstructured data such as movie reviews and comments on the social media application MySpace. They made use of multiple approaches, including the Natural Language Processing (NLP) approach, an unsupervised approach and the machine learning approach. For the machine learning approach, they primarily made use of the Support Vector Machine (SVM) classifier. They also proposed a novel approach in which each classification model could contribute to the other models to achieve an increased level of effectiveness. Medhat et al. [12] conducted a literature survey and categorised the research papers according to the techniques used to conduct the sentiment analysis. They also discussed fields related to sentiment analysis such as transfer learning, building resources and emotion detection. Hussein [13] conducted a literature survey on the challenges faced in conducting sentiment analysis and discussed the relationship between the review structure and the challenges faced in the sentiment analysis process.

Goncalves et al. [14] studied and presented comparisons of eight sentiment analysis techniques, covering both supervised machine learning algorithms and lexical approaches. They developed a novel method that combines existing methodologies to provide the best coverage results. They observed that even though accuracy and precision decrease as more methodologies are combined, the evaluation metrics remain in a reasonable range, with an F-score greater than 0.7. This indicates that combining every single methodology isn’t the best way to achieve high accuracy, and that the right combination of methods varies with the kind of data being dealt with. Fersini et al. [15] discussed the limitation associated with sentiment classification where text is considered as the unique source of information. They proposed a novel method, the Approval Network, to better represent contagion on social networks. Through the experiments conducted, they concluded that sentiment analysis methodologies based upon the proposed Approval Networks highly outperform the traditional approaches to sentiment analysis.

IV. EVENT STUDY METHODOLOGY AND DATASET

TABLE 1: INFORMATION ON DATASET FOR EVENT ANALYSIS

Event Date – Indian Budget 2022	1st February 2022
Anticipation Days: 22nd January 2022 – 31st January 2022	10 Days
Adjustment Days: 2nd February 2022 to 11th February 2022	10 Days
Estimation Window: 24th September 2021 to 21st January 2022	120 Days
Average Return of the Estimation Window	-0.076%

This study aims to find the impact of the Budget Day announcement on cryptocurrency in terms of abnormal returns by using the constant mean return model. The constant mean return model often yields results similar to those of more sophisticated models [16]. A T-test is used to identify the presence of abnormal returns. The study focuses on BTC-INR historical data downloaded from Yahoo Finance (https://finance.yahoo.com/quote/BTC-INR/history?p=BTC-INR). A 120-day estimation window has been used to estimate the variance [17]. Anticipation and adjustment periods of 10 days each have been taken before and after the event day in order to capture the price effects of the announcement [18]. The average return of the estimation window was -0.076% (Refer Table 1).

V. ANALYSIS OF DATA

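The Indian Budget 2022 announcement was considered as the event day. The 120 days preceding the 10-day anticipation period were taken as the estimation window (Refer Table 1). The Expected Return and Standard Deviation were calculated using the Bitcoin prices in the estimation window for the constant mean return model. Abnormal Returns were measured for the constant mean return model using the formula

$$AR_t = R_t - \bar{R}$$

where $AR_t$ is the abnormal return on day $t$, $R_t$ is the actual return on day $t$ and $\bar{R}$ is the expected return, taken as the average return over the estimation window.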

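The cumulative average residual method (CAR) measures the abnormal performance as the sum of each period's average abnormal performance [19]. The CAR starting at time $t_1$ through time $t_2$, where the horizon length is $L = t_2 - t_1 + 1$, is

$$CAR(t_1, t_2) = \sum_{t=t_1}^{t_2} AR_t$$

where $AR_t$ is the abnormal return at time $t$.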

A T-test was carried out on the Cumulative Abnormal Return to assess whether any abnormality in returns was detected due to the budget announcement on cryptocurrency (Refer Table 2). The hypothesis of the event study is:
H0: The Indian Budget announcement on cryptocurrency did not have any impact on the BTC-INR market.
Based on the T-test results of the CAR for the event day, the anticipation period and the adjustment period, all the T-values are below 1.96 (the two-sided critical value at the 5% significance level), indicating that there is no evidence of abnormal returns due to the Budget announcement on cryptocurrency. Abnormal returns (positive or negative) would have been evinced in the adjustment period if Bitcoin investors had reacted in a largely polarised manner to the budget announcements. Hence, the null hypothesis cannot be rejected.
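The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name and the toy return series are hypothetical, and the t-statistic assumes that daily abnormal returns are independent, so that the standard deviation of the CAR over L days is the daily standard deviation multiplied by the square root of L. That scaling appears consistent with the 10-day and 21-day standard deviations reported in Table 2, but the paper does not state it explicitly.

```python
import math
from statistics import mean, stdev


def constant_mean_event_study(estimation_returns, window_returns):
    """Abnormal returns, CAR and t-statistic under the constant mean return model."""
    expected = mean(estimation_returns)  # constant expected return
    sigma = stdev(estimation_returns)    # sample standard deviation of daily returns
    # Abnormal return: actual return minus the constant expected return.
    abnormal = [r - expected for r in window_returns]
    car = sum(abnormal)                  # cumulative abnormal return
    horizon = len(window_returns)        # horizon length L
    # Assumption: independent daily abnormal returns, so sd(CAR) = sigma * sqrt(L).
    t_stat = car / (sigma * math.sqrt(horizon))
    return {
        "expected": expected,
        "sigma": sigma,
        "abnormal": abnormal,
        "car": car,
        "t_stat": t_stat,
        "significant": abs(t_stat) > 1.96,  # two-sided test at the 5% level
    }


# Toy daily returns in percent (hypothetical values, not the BTC-INR data).
estimation = [1.0, -1.0, 2.0, -2.0]
window = [0.5, 0.5, 0.5, 0.5]
result = constant_mean_event_study(estimation, window)
print(round(result["car"], 4), round(result["t_stat"], 4), result["significant"])
```

With real data, `estimation_returns` would hold the 120 daily returns of the estimation window and `window_returns` the returns of the event day, the anticipation period or the adjustment period.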

TABLE 2: ANALYSIS OF DATA

Standard Deviation	Standard Deviation	3.2%
	Standard Deviation (10 Days)	10%
	Standard Deviation (21 Days)	15%
Return	Event	1.009%
	Anticipation	6.533%
	Adjustment	11.860%
	Total	19.401%
T-test	Event	0.3116295
	Anticipation	0.64
	Adjustment	1.158
	Total	1.308
p-value	Event	0.7558
	Anticipation	0.5244
	Adjustment	0.2488
	Total	0.1933

Though the closing prices of the adjustment period (10 days after 1st February 2022) are slightly higher than those of the anticipation period (10 days prior to 1st February 2022), the constant mean returns for these periods fluctuate. The Bitcoin returns on 2nd and 3rd February were slightly lower, and there was a momentary moderate spike on 4th February, followed by similarly low returns in both the anticipation and adjustment periods. With respect to trading volume, the anticipation period had a larger volume than the adjustment period until 6th February 2022, after which the volume gained momentum in the adjustment period (Refer Fig. 1).

VI. SENTIMENT ANALYSIS

Sentiment analysis is a common technique in Natural Language Processing (NLP). It is primarily focused on the classification of text into categories like "positive", "negative" or "neutral" [9]. This paper studies the sentiments of tweets with keywords including "Budget 2022" and "Bitcoin". We have made use of machine learning based classification for sentiment analysis. Machine learning based classification is effective and practical owing to its ability to achieve a level of accuracy comparable to that of human experts [20]. To collect a diverse set of tweet data for the sentiment analysis, three datasets were obtained: one for the date of the budget announcement, 1st February 2022, and one for each of the two following days. The sentiments of the tweets from each of these days are analyzed and compared with the extent of abnormality observed in the Bitcoin market. Because tweets are highly unstructured, the data must be pre-processed before it is used for analysis and learning. Figure 2 shows the steps involved in the whole sentiment analysis procedure used in this research.

As observed from Figure 2, our pre-processing of the dataset has involved a variety of methods to ensure better accuracy of the machine learning models. Firstly, in order to train the models, pre-classified tweets have been taken from the Sentiment140 dataset [21], where tweets are labelled 0 for a negative sentiment and 4 for a positive sentiment. After the dataset is imported into a Python environment, various pre-processing techniques are applied. Algorithm 1 summarizes the whole data cleaning process, which involves the removal of stop words, URLs, the string "RT" and special characters from the tweets. These characters and strings don’t play an important role in the NLP process for sentiment analysis and are even referred to as "unnecessary noise" [22]. Thus, removing them leads to better accuracy in predicting the sentiment of a tweet.
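The cleaning steps listed above can be sketched as follows. This is a minimal illustration rather than the paper's Algorithm 1: the function name, the tiny stop-word set and the exact regular expressions are assumptions made for the example, and a real pipeline would likely use a fuller stop-word list.

```python
import re

# Hypothetical, deliberately small stop-word set used only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "at"}


def clean_tweet(text):
    """Lower-case a tweet and strip URLs, the retweet marker, special characters and stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"\brt\b", " ", text)                 # remove the "RT" marker
    text = re.sub(r"[^a-z0-9\s]", " ", text)            # remove special characters
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_tweet("RT The Budget 2022 taxes #Bitcoin at 30%! https://t.co/abc"))
```

The order matters in this sketch: URLs are removed before special characters so that fragments of a link are not left behind as tokens.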

VII. FEATURE EXTRACTION

The feature extraction technique used for this dataset is the Term Frequency-Inverse Document Frequency (TF-IDF) method. TF-IDF is a very popular approach with use cases in multiple fields including information retrieval and text mining. It is used to evaluate the importance of each word within a collection of multiple documents [23]. The TF-IDF value varies with the frequency of a word in a document. It is based on two statistical measures, namely Term Frequency and Inverse Document Frequency. The term frequency is the number of times a term occurs in a document relative to the total number of words in that document.

When the TF value is high, the word is considered to have a high importance in the document [23]. Further, the inverse document frequency refers to the rarity of a word throughout the documents. If the IDF score is high, it is an indication that the word occurs rarely in the documents.
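To make the two quantities concrete, the sketch below computes TF-IDF scores by hand. It assumes the plain textbook definitions, TF as the term count divided by the document length and IDF as the natural logarithm of the number of documents divided by the number of documents containing the term; library implementations often use smoothed variants, and the paper does not state which variant was used. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: tf-idf score} dictionary per whitespace-tokenised document."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = ["bitcoin tax budget", "bitcoin price"]
for doc_scores in tf_idf(docs):
    print({term: round(score, 4) for term, score in doc_scores.items()})
```

In this toy corpus "bitcoin" appears in every document, so its IDF and therefore its TF-IDF score is zero, while rarer terms such as "tax" or "price" receive positive scores.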

VIII. USAGE OF MACHINE LEARNING MODELS

A. Support Vector Machine

The support vector machine was initially proposed by Boser, Guyon and Vapnik [24] as a supervised learning algorithm and was later presented again by Cristianini and Shawe-Taylor [25]. It is used for various purposes including regression and classification. Before the SVM model is applied, the data needs to be vectorized, and the linear SVM model object is then used. The main objective of the SVM algorithm is to find a hyperplane that distinctly separates the data points. Once the hyperplane has been identified, the data points are classified according to the side of the hyperplane on which they fall. Table 3 shows the type of hyperplane possible with respect to the number of input features provided. To ensure the least possible error when classifying the two classes, the optimal separating surface is used, that is, the one with the largest interval between the classes. In Figure 3, the line l refers to the hyperplane, the support vectors a1, a2 and a3 belong to the positive sentiments and the support vectors b1 and b2 belong to the negative sentiments.

TABLE 3: TYPES OF HYPERPLANES

Number of Input Features	Hyperplane
2	Line
3	2-D Plane
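The role of the hyperplane in a linear SVM can be illustrated with a short sketch. It shows only the decision rule and the margin width for an already-trained model; the weights, bias and feature vectors are hypothetical values, the 1/0 label encoding is an assumption, and training (finding the maximum-margin hyperplane) is not shown.

```python
import math


def svm_decision(weights, bias, features):
    """Signed score w.x + b; its sign tells on which side of the hyperplane the point lies."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def svm_predict(weights, bias, features):
    """Return 1 (positive sentiment) or 0 (negative sentiment); encoding assumed for illustration."""
    return 1 if svm_decision(weights, bias, features) >= 0 else 0


def margin_width(weights):
    """Distance between the two margin boundaries of a linear SVM, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(w * w for w in weights))


# Hypothetical two-feature model: with two input features the hyperplane is a line.
w, b = [1.0, -1.0], 0.0
print(svm_predict(w, b, [2.0, 0.5]), svm_predict(w, b, [0.5, 2.0]))
print(margin_width([3.0, 4.0]))
```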

B. Bernoulli Naïve Bayes

The Naive Bayes classifier makes use of Bayes' theorem. It is based upon the primary assumption that each feature is independent of the other features in a class. It is preferable in some cases to other supervised learning algorithms owing to its simplicity and its ability to train quickly on a dataset, which ultimately leads to a lower computation time [26]. Bayes' theorem allows for the calculation of posterior probabilities [27]:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

In the above equation, A refers to the sentiment labels that have been associated with each tweet and B refers to the class of sentiments that have been used. In our case, there are two classes: positive and negative. P(A | B) is the Bayesian probability that an instance A occurs in a particular class for each value of B. Although the Bayes classifier is widely used, it is generally sub-optimal for non-linearly separable concepts and can only learn linear discriminant functions, as observed in several studies [28] [29].
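A compact from-scratch sketch of a Bernoulli Naive Bayes classifier may help to show how Bayes' theorem and the independence assumption are used in practice. It is illustrative only: the function names and toy tweets are hypothetical, Laplace smoothing with `alpha=1.0` is an assumption, and the paper's own models were presumably built with a machine learning library rather than code like this.

```python
import math


def train_bernoulli_nb(docs, labels, alpha=1.0):
    """Estimate class priors and per-class word-presence probabilities."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    classes = sorted(set(labels))
    model = {"vocab": vocab, "classes": classes, "log_prior": {}, "p_word": {}}
    for c in classes:
        class_docs = [set(d.split()) for d, y in zip(docs, labels) if y == c]
        model["log_prior"][c] = math.log(len(class_docs) / len(docs))
        # Laplace-smoothed probability that a word is present in a document of class c.
        model["p_word"][c] = {
            w: (sum(w in d for d in class_docs) + alpha) / (len(class_docs) + 2 * alpha)
            for w in vocab
        }
    return model


def predict_bernoulli_nb(model, doc):
    """Pick the class with the highest log posterior (up to a shared constant)."""
    present = set(doc.split())
    best_class, best_score = None, -math.inf
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in model["vocab"]:
            p = model["p_word"][c][w]
            # Bernoulli model: absent words also contribute, through (1 - p).
            score += math.log(p if w in present else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy training data: 1 = positive, 0 = negative (hypothetical examples).
train_docs = ["good gain", "great gain", "bad loss", "terrible loss"]
train_labels = [1, 1, 0, 0]
nb_model = train_bernoulli_nb(train_docs, train_labels)
print(predict_bernoulli_nb(nb_model, "gain good"), predict_bernoulli_nb(nb_model, "loss"))
```

The denominator P(B) of Bayes' theorem is the same for every class, so the sketch compares only the numerators, in log space to avoid numerical underflow.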

C. Logistic Regression

Logistic regression is a model that uses multiple input dimensions to predict a result for new input given to it [30]. A number of independent variables are fed into the model along with their corresponding dependent variable to train the model. The dependent variable can be either 0 or 1, depending on the sentiment. The model also helps us analyze the effect of the independent variables on the dependent variable.

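The model is analytically depicted by the following equation [31]:

$$P(Y = 1 \mid x_1, \dots, x_k) = \frac{e^{\beta_0 + \sum_{i=1}^{k} \beta_i x_i}}{1 + e^{\beta_0 + \sum_{i=1}^{k} \beta_i x_i}}$$

where $x_1, \dots, x_k$ are the independent variables, $\beta_0$ is the intercept and $\beta_1, \dots, \beta_k$ are the regression coefficients.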

Tempting as it is to include many independent variables in the logistic regression model, doing so could lead to high standard errors [32]. Moreover, if the input variables are highly correlated, the precision of the logistic regression model is greatly reduced [32]. These limitations must be taken into account while using the model.
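How a fitted logistic regression model turns input features into a 0/1 sentiment label can be sketched as below. The intercept, coefficients and the 0.5 decision threshold are hypothetical values chosen for illustration; fitting the coefficients to data is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(intercept, coefficients, features):
    """Probability that the dependent variable equals 1 for the given features."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(z)


def predict_label(intercept, coefficients, features, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_proba(intercept, coefficients, features) >= threshold else 0


# Hypothetical fitted parameters for a model with two input features.
print(round(predict_proba(0.0, [2.0, -1.0], [1.0, 0.0]), 4))
print(predict_label(0.0, [2.0, -1.0], [0.0, 3.0]))
```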

IX. EVALUATION

To compare the different models, various evaluation metrics have been used. Accuracy is defined as the ratio of the number of correct predictions made by the model to the total number of predictions made by the model. Table 5 compares the accuracies of each model.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

In the above equation, TP (True Positive) refers to when the class of the tweet’s sentiment is positive and the model predicts it to be positive, TN (True Negative) refers to when the class of the tweet’s sentiment is negative and the model predicts it to be negative, FP (False Positive) refers to when the class of the tweet’s sentiment is negative and the model predicts it to be positive, and FN (False Negative) refers to when the class of the tweet’s sentiment is positive and the model predicts it to be negative. The trained and tested machine learning models rank by accuracy as follows:

Logistic Regression > Support Vector Machine > Bernoulli Naive Bayes.

Precision is an estimation of how many of the positively predicted samples are actually correct and Recall is an
estimation of the proportion of positive samples which were correctly identified. Once the Precision and Recall have been understood, the F1-score can be calculated. It is mathematically defined as the harmonic mean of Precision and Recall.
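The four metrics can be computed directly from the confusion-matrix counts, as the following sketch shows. The function name and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score for the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly positive samples that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1-score: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=80, tn=70, fp=30, fn=20)
print({name: round(value, 4) for name, value in metrics.items()})
```

Swapping the roles of the two classes (treating negative tweets as the class of interest) gives the corresponding metrics for negative tweets.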

TABLE 4: EVALUATION METRICS FOR POSITIVE AND NEGATIVE TWEETS

Model	Negative Tweets			Positive Tweets
Model	Precision	Recall	F1-Score	Precision	Recall	F1-Score
SVM	0.81	0.79	0.80	0.80	0.82	0.81
BNB	0.81	0.78	0.79	0.81	0.78	0.79
LR	0.83	0.80	0.81	0.80	0.83	0.82

TABLE 5: COMPARISON OF ACCURACIES OF VARIOUS MODELS

Model	Accuracy
Support Vector Machine	80.51%
Bernoulli Naïve Bayes	79.6%
Logistic Regression	81.4%

Thus, Logistic Regression is used for predicting the sentiments of the main dataset needed for this paper, as it has the highest accuracy. The evaluation metrics for the models predicting negative and positive sentiments are given in Table 4, and Figure 4 represents them graphically.

X. ANALYSIS OF THE SENTIMENTS

In order to observe the trend in positive and negative sentiments, the dataset has been split into three categories according to the date of tweeting. The Logistic Regression model has been applied separately to each of these categories and the results have been observed. A common trend on all three days is that there is not much variation between the positive and negative sentiments; the polarity isn’t high enough for one sentiment to overpower the other. On February 1st 2022, the day of the Budget 2022 announcement, the negative tweets have an edge over the positive tweets, occupying 64% of the total tweets. However, on February 2nd 2022, this gap becomes narrower, with the positive and negative tweets occupying 42.3% and 57.6% respectively, indicating that the overall sentiment among Twitter users is equivocal in nature. The total numbers of tweets collected on the three days are 470, 92 and 46 respectively, indicating that people’s interest in the topic declined sharply. Figure 6 represents the trend in the percentage share between positive and negative tweets. The lack of a wide range between the positive and negative sentiments can further be elucidated by the mean polarity values of the tweets on each day. The average polarity of the tweets on February 1 2022, February 2 2022 and February 3 2022 is 0.0607, 0.0684 and 0.0827 respectively; each of these values is close to zero, indicating a more neutral sentiment rather than a strong positive or a strong negative sentiment. Figure 5 graphically depicts the average polarity and subjectivity scores observed on each day.

A. Subjectivity

Subjectivity classification deals with classifying a document according to how opinionated it is [33]. It is thus reasonable to understand that the more subjective a document is, the larger the influence of emotions, opinions and personal views in it. The analysis conducted by us in this part of the research deals with the sentiment analysis of tweets that are related to "Indian Budget 2022" and "Bitcoin". Knowing the subjectivity of each tweet is essential as it gives us a valuable insight into the kind of tweets that are being analyzed and how influential they could be on the BTC-INR market. The value for subjectivity lies in the range [0,1], where 0 denotes the least subjective and 1 denotes the maximum possible subjectivity. When the subjectivity score approaches 0, it can be understood that the document leans more towards the factual side [34]. For our research, we have made use of the TextBlob toolkit to calculate the subjectivity scores of the tweet dataset on each observed day. It was observed that the mean subjectivity scores on February 1 2022, February 2 2022 and February 3 2022 were 0.17965, 0.21521 and 0.17329 respectively. Figure 7 depicts the distribution of subjectivity scores on each day using a scatter plot, and a line at y=0.5 has been taken as the reference line: any subjectivity score below the line is considered to be towards the factual side and any score above the line is considered to be towards the opinionated side. An interesting observation from Figure 7 is that on all three days more than half of the subjectivity scores lie below the reference line, indicating that more than half of the tweets in the dataset can be considered to lean towards a factual structure rather than being opinionated [34]. Considering that the gap between the positive and negative tweets isn’t very significant and that the mean subjectivity of the tweets on all three days leans towards the less opinionated side, this may be a plausible explanation for the lack of abnormality in the BTC-INR market during the recorded periods, as observed in this paper.
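The summary described above, the mean subjectivity per day and the share of tweets falling below the y=0.5 reference line, can be sketched as follows. The scores in the example are made-up placeholders; in the study they would be the per-tweet subjectivity values returned by TextBlob, and the function name is hypothetical.

```python
from statistics import mean


def summarize_subjectivity(scores, reference=0.5):
    """Mean subjectivity and the share of scores on the factual side of the reference line."""
    below = sum(1 for s in scores if s < reference)
    return {
        "mean": mean(scores),
        "share_below": below / len(scores),
        # True when more than half of the tweets lean towards the factual side.
        "mostly_factual": below / len(scores) > 0.5,
    }


# Placeholder subjectivity scores in [0, 1] for one day (not the paper's data).
day_scores = [0.0, 0.2, 0.4, 0.9]
print(summarize_subjectivity(day_scores))
```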

XI. DISCUSSION

As per the event study on Bitcoin price movements (BTC-INR) on the day of the Indian Budget 2022 as well as the 10 days before and after 1st February 2022, no abnormal returns were evinced in the said period. This is similar to the result of Ante [8] in the study on the impact of Elon Musk’s Twitter activity on cryptocurrency, where the results revealed that price effects were not significant for Bitcoin. In our study, we found that the negative sentiments expressed in tweets on BTC-INR were slightly higher than the positive sentiments on all three days, though not substantially (Refer Fig. 6), and the difference narrowed further on 2nd February 2022, suggesting that the positive and negative sentiments negated one another. Moreover, the lack of strong opinion in both the positive and negative tweets can further be explicated by the distribution of the subjectivity of the tweets. Here, the lack of subjectivity in the observed tweets is inferred from the plots, where more than half of the subjectivity scores fall below the reference value y=0.5 (Refer Fig. 7). When we compare the results of the event study and the sentiment analysis, we can corroborate that sentiments expressed on Twitter affect the financial market. This result is in line with the results of different studies [4] [5] [6]. In our case, we may conclude that the Twitter sentiments did not result in any abnormal returns in the BTC-INR market as the positive and negative sentiments negate one another, as similarly shown in [8], and we have further expanded on the cause by observing the lack of subjectivity in both the positive and negative tweets.

XII. SCOPE FOR FUTURE RESEARCH

Our research is robust as we have adopted a comparative analysis of multiple machine learning models for classifying the sentiments of Twitter posts and have chosen Logistic Regression, which has the highest accuracy. We have also compared the models using evaluation metrics beyond accuracy, on which Logistic Regression still proves to be the best model. We have studied the linkage between sentiment analysis and event study in order to know whether the sentiments expressed get translated into market movement. We have restricted the study to BTC-INR as Bitcoin is the most traded cryptocurrency in India (https://www.forbes.com/advisor/in/investing/top-10-cryptocurrencies-in-india/).

Future studies may be extended to other cryptocurrencies and their movement in the Indian market. With respect to the event study, we have used an estimation window of 120 days and windows of 10 days before and after the event day. The analysis could also be carried out on a minute-by-minute basis to understand the short-term movements in the price. The sentiment analysis could also be done in three parts, the anticipation period, the event day and the adjustment period, to decipher any abnormality in a detailed manner.

References

[1] D.-E. Diaconasu, S. Mehdian, O. Stoica, An analysis of investors’ behavior in bitcoin market, PLOS ONE 17 (3) (2022) 1–18. doi:10.1371/journal.pone.0264522. URL https://doi.org/10.1371/journal.pone.0264522
[2] B. Szetela, G. Mentel, Y. Bilan, U. Mentel, The relationship between trend and volume on the bitcoin market, Eurasian Economic Review 11 (1) (2021) 25–42. doi:10.1007/s40822-021-00166-5.
[3] M. Makrehchi, S. Shah, W. Liao, Stock prediction using event-based sentiment analysis, 2013, pp. 337–342. doi:10.1109/WI-IAT.2013.48.
[4] T. T. P. Souza, O. Kolchyna, P. C. Treleaven, T. Aste, Twitter sentiment analysis applied to finance: A case study in the retail industry, ArXiv abs/1507.00784.
[5] G. Ranco, D. Aleksovski, G. Caldarelli, M. Grčar, I. Mozetič, The effects of twitter sentiment on stock price returns, PLOS ONE 10 (9) (2015) 1–21. doi:10.1371/journal.pone.0138441. URL https://doi.org/10.1371/journal.pone.0138441
[6] W. Yue, S. Zhang, Q. Zhang, Asymmetric News Effects on Cryptocurrency Liquidity: an Event Study Perspective, Finance Research Letters 41 (C). doi:10.1016/j.frl.2020.101799. URL https://ideas.repec.org/a/eee/finlet/v41y2021ics1544612320316135.html
[7] M. S. Brown, B. Douglass, An event study of the effects of cryptocurrency thefts on cryptocurrency prices, 2020 Spring Simulation Conference (SpringSim) (2020) 1–12.
[8] L. Ante, How elon musk’s twitter activity moves cryptocurrency markets, SSRN Electronic Journal. doi:10.2139/ssrn.3778844.
[9] A. Agarwal, B. Xie, I. Vovsha, O. Rambow, R. Passonneau, Sentiment analysis of twitter data, 2011.
[10] L. Lin, J. Li, R. Zhang, W. Yu, C. Sun, Opinion mining and sentiment analysis in social networks: A retweeting structure-aware approach (2014) 890–895. doi:10.1109/UCC.2014.145.
[11] R. Prabowo, M. Thelwall, Sentiment analysis: A combined approach, Journal of Informetrics 3 (2) (2009) 143–157. URL https://EconPapers.repec.org/RePEc:eee:infome:v:3:y:2009:i:2:p:143-157
[12] W. Medhat, A. Hassan, H. Korashy, Sentiment analysis algorithms and applications: A survey, Ain Shams Engineering Journal 5 (4) (2014) 1093–1113. doi:10.1016/j.asej.2014.04.011. URL https://www.sciencedirect.com/science/article/pii/S2090447914000550
[13] D. M. E.-D. M. Hussein, A survey on sentiment analysis challenges, Journal of King Saud University-Engineering Sciences 30 (4) (2018) 330–338. doi:10.1016/j.jksues.2016.04.002. URL https://www.sciencedirect.com/science/article/pii/S1018363916300071
[14] P. Goncalves, M. Araujo, F. Benevenuto, M. Cha, Comparing and combining sentiment analysis methods, 2013, pp. 27–38. doi:10.1145/2512938.2512951.
[15] E. Fersini, F. A. Pozzi, E. Messina, Approval network: A novel approach for sentiment analysis in social networks, World Wide Web 20 (4) (2016) 831–854. doi:10.1007/s11280-016-0419-8.
[16] S. J. Brown, J. B. Warner, Using daily stock returns: The case of event studies, Journal of Financial Economics 14 (1) (1985) 3–31. doi:10.1016/0304-405X(85)90042-X. URL https://www.sciencedirect.com/science/article/pii/0304405X8590042X
[17] T. Dyckman, D. Philbrick, J. Stephan, A comparison of event study methodologies using daily stock returns: A simulation approach, Journal of Accounting Research 22 (1984) 1–30. URL http://www.jstor.org/stable/2490855
[18] A. C. MacKinlay, Event studies in economics and finance, Journal of Economic Literature 35 (1) (1997) 13–39. URL http://www.jstor.org/stable/2729691
[19] S. P. Kothari, J. B. Warner, Econometrics of event studies, 2007.
[20] F. Sebastiani, Machine learning in automated text categorization, ACM Comput. Surv. 34 (1) (2002) 1–47. doi:10.1145/505282.505283. URL https://doi.org/10.1145/505282.505283
[21] Kazanova, Sentiment140 dataset with 1.6 million tweets (Sep 2017). URL https://www.kaggle.com/datasets/kazanova/sentiment140
[22] R. Patel, K. Passi, Sentiment analysis on twitter data of world cup soccer tournament using machine learning, IoT 1 (2) (2020) 218–239. doi:10.3390/iot1020014. URL https://www.mdpi.com/2624-831X/1/2/14
[23] S.-W. Kim, J.-M. Gil, Research paper classification systems based on tf-idf and lda schemes, Human-centric Computing and Information Sciences 9 (1). doi:10.1186/s13673-019-0192-7.
[24] B. E. Boser, I. M. Guyon, V. N. Vapnik, A training algorithm for optimal margin classifiers, in: Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, ACM Press, 1992, pp. 144–152.
[25] N. Cristianini, J. Shawe-Taylor, An introduction to support vector machines and other kernel-based learning methods, 2000.
[26] T. Yang, K. Qian, D. C.-T. Lo, K. Al Nasr, Y. Qian, Spam filtering using association rules and naïve bayes classifier, in: 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), 2015, pp. 638–642. doi:10.1109/PIC.2015.7489926.
[27] K. P. Murphy, et al., Naive bayes classifiers, University of British Columbia 18 (60) (2006) 1–8.
[28] I. Rish, An empirical study of the naive bayes classifier, in: IJCAI 2001 workshop on empirical methods in artificial intelligence, Vol. 3, IBM New York, 2001, pp. 41–46.
[29] R. O. Duda, P. E. Hart, Pattern classification and scene analysis / Richard O. Duda, Peter E. Hart, Wiley New York, 1973. URL http://www.loc.gov/catdir/enhancements/fy0607/72007008-t.html
[30] A. Strzelecka, A. Kurdys-Kujawska, D. Zawadzka, Application of logistic regression models to assess household financial decisions regarding debt, Procedia Computer Science 176 (2020) 3418–3427, knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 24th International Conference KES2020. doi:10.1016/j.procs.2020.09.055. URL https://www.sciencedirect.com/science/article/pii/S1877050920319505
[31] A. Stanisz, Przystepny kurs statystyki zastosowaniem statistica pl tom 3 (Nov 2007).
[32] P. Ranganathan, C. Pramesh, R. Aggarwal, Common pitfalls in statistical analysis: Logistic regression, Perspectives in Clinical Research 8 (2017) 148–151.
[33] B. Liu, Sentiment analysis and subjectivity, 2010, pp. 627–666.
[34] K. K. Bhagat, S. Mishra, A. Dixit, C.-Y. Chang, Public opinions about online learning during covid-19: A sentiment analysis approach, Sustainability 13 (6). doi:10.3390/su13063346. URL https://www.mdpi.com/2071-1050/13/6/3346

Copyright

Copyright © 2022 Abhinand G, Dr. V. Uma Maheswari. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET43533

Publish Date : 2022-05-29

ISSN : 2321-9653

Publisher Name : IJRASET
