Analysis of Twitter sentiment in COVID-19 era using fuzzy logic method

ABSTRACT


INTRODUCTION
This year, lifestyles have changed almost all over the world. The cause is the emergence of a new virus that infects humans and causes Coronavirus Disease 2019 (Covid-19). Coronaviruses are a group of viruses from the Orthocoronavirinae subfamily that can cause disease not only in humans but also in birds and mammals [1]. The virus spread from the Chinese city of Wuhan and caused a large, prolonged pandemic that continues around the world. The pandemic affects all areas of life, including the economy, society, politics, and education.
Because the Covid-19 pandemic did not end, the government implemented Large-Scale Social Restrictions (PSBB), which led the community to adopt work from home (WFH) for office workers and distance learning for students. To support these arrangements, people need to use gadgets. Over time, people have become more active in socializing indirectly over the internet, which has increased the number of social media users. More than 100 million active users now open applications such as social media [2].
Social media is a persistent, internet-based, masspersonal communication channel that facilitates the perception of interaction among users, derives value primarily from user-generated content, and is disentrained [3]. The most commonly used social media networks include Facebook, YouTube, Instagram, and Twitter [4]. Twitter has become one of the most popular social media platforms in the world, with more than 200 million active users and 10.6 billion tweets worldwide. Users can easily share their arguments, stories, expressions, breaking news, or hot topics via Twitter. Indonesia, one of the largest countries in the world with a population of more than 200 million people, also has a large number of active Twitter users [5].
J. Soft. Comp. Explor., Vol. 2, No. 1, March 2021: 1-5
Twitter in particular has seen a surge of tweets about the Covid-19 pandemic, and hashtags about events during the pandemic have become trending topics. As a result, many responses and perceptions emerge from the community. The data obtained from Twitter can be processed and analyzed so that it is useful for society and for organizations. This study uses sentiment analysis to analyze individuals' arguments, stories, and expressions about events that occurred during the Covid-19 pandemic by text mining posts on Twitter. Sentiment analysis is the study of people's arguments and opinions, expressed in text, towards entities such as products or services [6], [7].
Based on the description above, this research applies opinion-based sentiment analysis and sorts topics from the most frequently discussed to the least frequently discussed. It uses fuzzy logic to design and build bots that can analyze user opinions on Twitter. Fuzzy logic is useful for processing and evaluating information [8]. The purposes of this study are to analyze public sentiment towards events during Covid-19, to find the trending order of topics on Twitter during the pandemic, and to design and build bots that can analyze user opinions on Twitter.

LITERATURE REVIEW
In a systematic literature review, 7 of the reviewed papers used the lexicon method, 10 used machine learning methods, and 7 combined the two when applying sentiment analysis. The lexicon-based method is an unsupervised learning method. Most of the studies adopted the SentiWordNet and TF-IDF methods. The SentiWordNet method computes a score from the positive and negative words in a text. The TF-IDF method converts words into numbers using term frequency-inverse document frequency weighting. Machine learning methods require training data. The models most often used are Naive Bayes and SVM. Combining the lexicon and machine learning methods can improve results. Most of the data was taken from the social media site Twitter [9].
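To make the TF-IDF weighting mentioned above concrete, the following is a minimal sketch in plain Python. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as log(N / df)); the reviewed papers may use smoothed or otherwise different variants, and the function name `tf_idf` and the sample documents are illustrative only.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Libraries often apply smoothing, so their numbers will differ.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Illustrative documents, not data from the study.
docs = ["covid cases rise", "covid vaccine hope", "vaccine brings hope"]
weights = tf_idf(docs)
```

In this sketch a term that appears in fewer documents ("cases") receives a higher weight than one that appears in more documents ("covid"), and a term present in every document receives a weight of zero.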
In [10], the CorE Q9 bootstrap algorithm is used to find semantic lexicons that divide tweets into two categories: stressed and non-stressed. The Twitter data used in that article was collected via the Twitter API from January to April 2020 for the continental United States. One of the main innovations of that research is mapping the stress symptoms caused by COVID-19 on a temporal scale. The algorithm takes a large, unannotated corpus, finds new related words in it, and assigns them to a semantic category (stress or non-stress in this case). Before the bootstrapping process, pattern extraction is carried out on the unannotated corpus to extract all subject noun phrases, direct objects, and prepositional phrases. A universal sentence encoder is used to generate text embeddings. These embeddings transform each tweet into a high-dimensional numeric vector, which is needed to find semantic similarities and perform classification tasks. The classifiers used in the training process are SVM, logistic regression, naïve Bayes, and simple neural networks. In this way, the authors were able to observe the spatiotemporal patterns of stress symptoms and to identify the main concerns about the pandemic in different geographic areas on different time scales [10].
Another study performed sentiment analysis of Twitter users' responses to the anti-LGBT campaign in Indonesia, classifying the responses as positive, negative, or neutral. It used the Naive Bayes algorithm because of its high accuracy in sentiment analysis. The study concluded that users on average gave neutral comments, with an accuracy of 86.43% obtained using the RapidMiner tool [11].

METHOD
This research analyzes the responses or perceptions of the community in an ongoing situation. It uses two sources of information. The first is data on public sentiment collected from Twitter through hashtags related to events during the Covid-19 pandemic. The population obtained from Twitter users was 212 tweets, collected using a random sampling method. The second is a literature review of journals and books related to the research being carried out.
After the data source is determined, data collection is carried out. As previously mentioned, data was obtained from Twitter using queries or specific hashtags about events that occurred during Covid-19. The queries make it easier to find a range of public sentiments about those events.
After the data is collected, calculations are performed using the fuzzy method. The fuzzy method is regarded as able to map an input to an output without neglecting the existing aspects [12], [13].
ISSN: 2746-0991

Analysis of Twitter Sentiment in Covid-19 Era Using Fuzzy Logic Method (Devi Ajeng Efrilianda)
Fuzzy logic is considered easy to apply because it is flexible and can be based on human reasoning expressed in everyday language that is easy to understand [4], [14]. The fuzzy calculation begins by converting linguistic variables into membership functions. Each function represents a linguistic variable and describes a fuzzy set defined by certain criteria. After the sets are described, the next step is to determine the if-then rules, which relate the fuzzy sets through the operators min (representing AND) and max (representing OR). In the final step, the outputs of all rules are combined into the final fuzzy set.
The simplest form of data collected is text from the Twitter application. Searching for topics on Twitter is also easier with the hashtags of related topics or events. As mentioned earlier, this study therefore collects data through hashtags used by Twitter users, and the data is later analyzed with data mining methods. Data mining is a method of finding information or patterns in selected data [15], [16].
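The fuzzification and if-then steps described in this section can be sketched as follows. This is an illustrative example, not the rule base used in the study: the two input variables (the share of positive and of negative words in a tweet), the linear membership functions, their breakpoints, and the three rules are all assumptions. The sketch uses the standard Zadeh operators, with min for AND and max for OR.

```python
def ramp_up(x, a, b):
    """Membership that rises linearly from 0 at a to 1 at b (clamped)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def fuzzify(share):
    """Map a word share in [0, 1] to degrees of the sets 'low' and 'high'.

    The breakpoints 0.0 and 0.5 are hypothetical.
    """
    high = ramp_up(share, 0.0, 0.5)
    return {"low": 1.0 - high, "high": high}


def fuzzy_and(*degrees):
    return min(degrees)  # Zadeh AND


def fuzzy_or(*degrees):
    return max(degrees)  # Zadeh OR


def infer(pos_share, neg_share):
    """Evaluate three hypothetical if-then rules and return their strengths."""
    pos, neg = fuzzify(pos_share), fuzzify(neg_share)
    return {
        # IF positive share is high AND negative share is low THEN positive
        "positive": fuzzy_and(pos["high"], neg["low"]),
        # IF negative share is high AND positive share is low THEN negative
        "negative": fuzzy_and(neg["high"], pos["low"]),
        # IF both shares are low OR both shares are high THEN neutral
        "neutral": fuzzy_or(
            fuzzy_and(pos["low"], neg["low"]),
            fuzzy_and(pos["high"], neg["high"]),
        ),
    }


result = infer(pos_share=0.4, neg_share=0.1)
```

For a tweet with a 0.4 share of positive words and a 0.1 share of negative words, the "positive" rule fires most strongly in this sketch. A full system would follow these rule strengths with a defuzzification step to obtain a single score.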

Text analysis
Anyone can analyze the sentiment of a tweet manually. However, the number of Twitter users keeps growing, which makes manual analysis inefficient and time-consuming. Information technology makes this process easier. This study uses the fuzzy method, which supports linguistic reasoning [17]-[19]. In this process, we obtained 200 data items, which we sorted into negative tweets and positive tweets.

Twitter API
An Application Programming Interface, often called an API, is a program used to retrieve or modify data [20]. The API provided by Twitter makes it easier to obtain information available on Twitter. The Twitter API helps in selecting data so that the data used is more concise. Some of the data obtained is deleted later, in a step often called preprocessing [21], [22]. Preprocessing starts with case folding, which homogenizes the tweet text into lowercase. It continues with data cleaning, which removes the RT (retweet) marker, URLs, and usernames. The last step is stopword removal. A stopword is a word that appears in the data but contributes little to tweet analysis [23].
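The preprocessing steps described above (case folding, data cleaning, and stopword removal) can be sketched as follows. This is a minimal illustration rather than the study's implementation: the stopword list is a tiny English placeholder, the regular expressions are assumptions about what counts as a URL or username, and the punctuation-stripping step is an extra assumption not mentioned in the text.

```python
import re

# Tiny illustrative stopword list; a real run would use a fuller list
# in the language of the tweets.
STOPWORDS = {"the", "is", "a", "an", "to", "of", "and", "in", "for", "on"}


def preprocess(tweet):
    """Return the list of cleaned tokens for one tweet."""
    text = tweet.lower()                       # case folding
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove usernames
    text = re.sub(r"\brt\b", " ", text)        # remove the retweet marker
    # Assumption: also strip punctuation, but keep '#' so hashtags survive.
    text = re.sub(r"[^a-z0-9#\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]  # stopword removal


tokens = preprocess("RT @user: The Covid-19 cases RISE https://t.co/abc #StaySafe")
```

URLs and usernames are removed before the retweet marker and punctuation so that fragments of a link or handle are not left behind as stray tokens.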

Fuzzy Application
Before applying fuzzy logic in this research, the first step is to connect the data with the existing database. Next, we input the tweet text and hashtags, which are then processed using fuzzy logic. This stage produces a positive score and a negative score for each tweet. The two scores are then subtracted from each other to produce the final score. After that, the data normalization stage is carried out. Data normalization aims to assist in data measurement and reduce redundancies [20]. After the weighting is done, we count and classify the data into five categories: very positive, positive, neutral, negative, and very negative tweets. The tweet analysis model is shown in Figure 1.
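The scoring, normalization, and classification steps described above can be sketched as follows. Several details are assumptions, because the text does not specify them: the final score is taken as the positive score minus the negative score, normalization is interpreted as scaling scores into [-1, 1] by the largest absolute value, and the thresholds 0.2 and 0.6 that separate the five categories are hypothetical.

```python
def final_score(positive, negative):
    """Assumed direction of the subtraction: positive minus negative."""
    return positive - negative


def normalize(scores):
    """Scale scores into [-1, 1] by the largest absolute value (assumed scheme)."""
    peak = max((abs(s) for s in scores), default=0.0)
    if peak == 0:
        return [0.0 for _ in scores]
    return [s / peak for s in scores]


def classify(score, weak=0.2, strong=0.6):
    """Map a normalized score to one of five categories.

    The thresholds `weak` and `strong` are hypothetical.
    """
    if score >= strong:
        return "very positive"
    if score >= weak:
        return "positive"
    if score > -weak:
        return "neutral"
    if score > -strong:
        return "negative"
    return "very negative"


# Illustrative (positive, negative) score pairs, not data from the study.
pairs = [(1.0, 0.25), (0.5, 0.25), (0.5, 0.5), (0.25, 0.5), (0.0, 1.0)]
raw = [final_score(p, n) for p, n in pairs]
labels = [classify(s) for s in normalize(raw)]
```

With these sample pairs the sketch produces one tweet in each of the five categories, which makes it easy to check the threshold boundaries before substituting real scores.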

RESULT AND DISCUSSION
First, each tweet is extracted and parsed, and its sentiment value is calculated using the fuzzy logic method. Once the tweet values are available, their arithmetic mean can be calculated. The results give the percentages of positive, neutral, and negative tweets. Many of the conversations that occurred during Covid-19 discussed economic topics, case reports, and health. The authors of [24] analyzed Twitter data from 2 February to 15 March 2020 and identified topics, 10 of which had positive sentiment and 2 of which had negative sentiment. The study in [25] analyzed data from 1 January to 23 March and found more positive and neutral tweets, while negative tweets were few. This is because people thought the Covid-19 virus would end soon. However, negative tweets increased as time went on.
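The aggregation step described above, an arithmetic mean of the tweet values followed by the percentage of each sentiment class, can be sketched as follows. The scores and labels are made-up illustrative values, not the study's data, and the function name `percentages` is hypothetical.

```python
from collections import Counter
from statistics import mean


def percentages(labels):
    """Return the percentage of tweets in each sentiment class."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: 100 * count / total for label, count in counts.items()}


# Illustrative sentiment values and labels for ten tweets.
scores = [0.5, 0.25, 0.75, 0.5, 0.25, -0.5, -0.25, -0.75, 0.0, 0.0]
labels = ["positive"] * 5 + ["negative"] * 3 + ["neutral"] * 2

average = mean(scores)       # arithmetic mean of the tweet values
share = percentages(labels)  # percentage per class
```

Reporting the mean alongside the class percentages shows both the overall direction of sentiment and how the tweets are distributed across classes.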
Overall, sentiment about Covid-19 is developing rapidly, and the pandemic has become more complex over time. In this study, positive tweets were still dominant during the Covid-19 period: 48% of tweets were positive, 30% were negative, and 22% were neutral. The application used to identify tweet sentiment during Covid-19 combines fuzzy logic methods with artificial intelligence. With the help of the Twitter API, tweet data from the Covid-19 pandemic can be collected to find the most frequent tweet sentiments.

CONCLUSION
Based on calculations using the fuzzy method, this study found 48% positive tweets, 30% negative tweets, and 22% neutral tweets during the Covid-19 pandemic. Positive tweets therefore outnumber both negative and neutral tweets. This is because people think that the pandemic will not last long, even though the number of negative tweets grows every day.