Accuracy of Malaysia Public Response to Economic Factors During the Covid-19 Pandemic Using Vader and Random Forest

Jumanto Jumanto
Much Aziz Muslim
Yosza Dasril
Tanzilal Mustaqim

Abstract

This study analyzes public sentiment on Twitter about the economic impact of the Covid-19 pandemic on people's lives. The analysis covers 23,777 tweets collected from 13 states in Malaysia between 1 December 2019 and 17 June 2020. The research process has three stages: pre-processing, labeling, and modeling. In pre-processing, the data are collected and cleaned. In labeling, Vader sentiment polarity detection assigns a sentiment to each tweet, and the labeled tweets serve as training data. In modeling, the labeled data are classified with the random forest algorithm, using count vectorizer and TF-IDF feature extraction together with N-gram feature selection. Based on Vader labeling, public sentiment in Malaysia is predominantly positive: 11,323 tweets are positive, 4,105 neutral, and 8,349 negative. Random forest modeling reached an accuracy of 93.5 percent with TF-IDF and 1-gram features.
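
The labeling step and the reported class counts can be illustrated with a short sketch. It assumes compound-score cutoffs of +0.05 and -0.05, which are commonly used with Vader; the abstract does not state which thresholds the study applied, and `label_from_compound` is a hypothetical helper name introduced here for illustration. The counts are the ones reported in the abstract.

```python
# Hypothetical helper: map a Vader-style compound score in [-1, 1] to a label.
# The +/-0.05 cutoffs are an assumption (commonly used defaults),
# not values confirmed by the study.
def label_from_compound(compound, threshold=0.05):
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Class counts reported in the abstract (based on Vader labeling).
counts = {"positive": 11323, "neutral": 4105, "negative": 8349}
total = sum(counts.values())

# Share of each sentiment class among all labeled tweets.
shares = {label: n / total for label, n in counts.items()}

if __name__ == "__main__":
    print("total tweets:", total)
    for label, share in shares.items():
        print(f"{label}: {share:.1%}")
```

The three counts add up to the 23,777 tweets in the dataset, and positive tweets form the largest class at a little under half of the total.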

Article Details

How to Cite
Jumanto, J., Muslim, M. A., Dasril, Y., & Mustaqim, T. (2022). Accuracy of Malaysia Public Response to Economic Factors During the Covid-19 Pandemic Using Vader and Random Forest. Journal of Information System Exploration and Research, 1(1), 49–70. https://doi.org/10.52465/joiser.v1i1.104

