Comparison of the performance of naive bayes and support vector machine in sirekap sentiment analysis with the lexicon-based approach

Ramadhana Setiyawan
Zaenal Mustofa

Abstract

The general public often uses the SiRekap application to follow the progress of the election and to voice criticism. Government policies have both good and bad outcomes, and users leave reviews and ratings on the Google Play Store, where the app can be downloaded. These reviews can be collected and processed into useful information, for example through sentiment analysis with the Naïve Bayes and Support Vector Machine (SVM) methods. The two methods behave differently during training and during evaluation, although the results across the scenarios tested differed only slightly. During training, the SVM model handled lexicon-labeled comment data better than the Naïve Bayes model, with accuracy 10% higher. During evaluation, both models reached the same accuracy of 72%, but they differed in precision, recall, and F1-score: the SVM model was 5% better in precision, 8% better in recall, and 3% better in F1-score than the Naïve Bayes model. This research is limited to comparing the performance of two machine learning models, Naïve Bayes and SVM, on lexicon-labeled data. The results leave room for improvement. Evaluation results could be improved by adding data, by using text data augmentation, or by having the data created with input from language sentiment experts.
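
The two ideas in the abstract, lexicon-based labeling and models with equal accuracy but different precision, recall, and F1-score, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon, the rule that a score above zero means "positive", the function names `lexicon_label` and `binary_metrics`, and the two sets of made-up predictions. None of it reproduces the study's lexicon, data, or reported figures.

```python
# Hypothetical toy lexicon (word -> polarity score); NOT the lexicon used in the study.
TOY_LEXICON = {"good": 2, "helpful": 1, "slow": -1, "error": -2, "bad": -2}


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Label a review by summing the polarity scores of its words.

    Assumption: a score above zero is "positive", anything else is "negative".
    """
    score = sum(lexicon.get(word, 0) for word in text.lower().split())
    return "positive" if score > 0 else "negative"


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up ground truth: five positive reviews followed by five negative ones.
y_true = ["positive"] * 5 + ["negative"] * 5

# Hypothetical model A misses two positive reviews and makes no other error.
pred_a = ["positive"] * 3 + ["negative"] * 2 + ["negative"] * 5

# Hypothetical model B misses one positive review and mislabels one negative review.
pred_b = ["positive"] * 4 + ["negative"] * 1 + ["positive"] * 1 + ["negative"] * 4

metrics_a = binary_metrics(y_true, pred_a)
metrics_b = binary_metrics(y_true, pred_b)

print(lexicon_label("good and helpful app"))
print(lexicon_label("bad error slow"))
print("model A:", metrics_a)
print("model B:", metrics_b)
```

Both hypothetical models get 8 of 10 reviews right, so their accuracy is identical, yet the errors fall on different classes and the precision, recall, and F1-score diverge. This is why the comparison reports all four metrics rather than accuracy alone.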

Article Details

How to Cite
[1]
R. Setiyawan and Z. Mustofa, “Comparison of the performance of naive bayes and support vector machine in sirekap sentiment analysis with the lexicon-based approach”, J. Soft Comput. Explor., vol. 5, no. 2, pp. 122-132, Jun. 2024.

