Support Vector Machine (SVM) Optimization Using Grid Search and Unigram to Improve E-Commerce Review Accuracy
Abstract
Electronic commerce (e-commerce) is the distribution, buying, selling, and marketing of goods and services over electronic systems such as the Internet, television, websites, and other computer networks. E-commerce platforms such as amazon.com and Lazada.co.id offer products at various prices and levels of quality. Sentiment analysis is used to gauge a product’s popularity from customer reviews. Approaches to sentiment analysis include machine learning. The part of machine learning that focuses on text processing is called text mining. One text mining technique is classification, and the Support Vector Machine (SVM) is one of the most frequently used classification algorithms. Feature and parameter selection significantly affect SVM classification accuracy. In this study, we chose unigrams for feature extraction and grid search for parameter optimization to improve SVM classification accuracy. Two customer review datasets in different languages are used: Amazon reviews written in English and Lazada reviews written in Indonesian. 10-fold cross-validation and a confusion matrix are used to evaluate the experimental results. The results show that applying unigram features and grid search to the SVM algorithm improves accuracy by 26.4% on the Amazon reviews and by 4.26% on the Lazada reviews.
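The pipeline described in the abstract (unigram features, an SVM, grid search scored by k-fold cross-validation, and a confusion matrix) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation. The abstract does not state the kernel or which SVM parameters were tuned, so a linear SVM trained with a Pegasos-style sub-gradient method stands in here, and the grid over `lam` (regularization strength) and `epochs` is hypothetical. The toy reviews are invented, and 5 folds are used instead of 10 only because the toy set has ten samples.

```python
import itertools
import random
from collections import Counter

# Hypothetical toy reviews: +1 = positive, -1 = negative.
REVIEWS = [
    ("great phone fast delivery", 1), ("bad quality broke quickly", -1),
    ("love it works great", 1), ("terrible seller bad product", -1),
    ("excellent value fast shipping", 1), ("poor packaging item broke", -1),
    ("works great love the price", 1), ("bad battery terrible quality", -1),
    ("fast delivery excellent product", 1), ("poor quality bad seller", -1),
]

def unigrams(text):
    """Split a review into lowercase single-word tokens (unigrams)."""
    return text.lower().split()

def build_vocab(texts):
    """Assign an integer index to every unigram seen in the training texts."""
    vocab = {}
    for text in texts:
        for tok in unigrams(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Sparse unigram counts; tokens outside the vocabulary are ignored."""
    return Counter(vocab[tok] for tok in unigrams(text) if tok in vocab)

def train_svm(X, y, lam, epochs, seed=0):
    """Linear SVM via Pegasos-style sub-gradient steps on the hinge loss."""
    rng = random.Random(seed)
    w, t = {}, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(w.get(j, 0.0) * v for j, v in X[i].items())
            for j in w:  # shrinkage from the L2 regularizer
                w[j] *= 1.0 - eta * lam
            if margin < 1:  # margin violated: move towards the sample
                for j, v in X[i].items():
                    w[j] = w.get(j, 0.0) + eta * y[i] * v
    return w

def predict(w, x):
    score = sum(w.get(j, 0.0) * v for j, v in x.items())
    return 1 if score >= 0 else -1

def cross_val_predict(texts, labels, lam, epochs, k):
    """Out-of-fold predictions; sample i belongs to fold i % k."""
    preds = [0] * len(texts)
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        test = [i for i in range(len(texts)) if i % k == fold]
        vocab = build_vocab(texts[i] for i in train)
        X = [vectorize(texts[i], vocab) for i in train]
        w = train_svm(X, [labels[i] for i in train], lam, epochs)
        for i in test:
            preds[i] = predict(w, vectorize(texts[i], vocab))
    return preds

def accuracy(labels, preds):
    return sum(a == b for a, b in zip(labels, preds)) / len(labels)

def grid_search(texts, labels, grid, k):
    """Try every parameter combination; keep the best cross-validated accuracy."""
    best = None
    for lam, epochs in itertools.product(grid["lam"], grid["epochs"]):
        acc = accuracy(labels, cross_val_predict(texts, labels, lam, epochs, k))
        if best is None or acc > best[0]:
            best = (acc, {"lam": lam, "epochs": epochs})
    return best

def confusion_matrix(labels, preds):
    cm = Counter(zip(labels, preds))
    return {"tp": cm[(1, 1)], "fn": cm[(1, -1)],
            "fp": cm[(-1, 1)], "tn": cm[(-1, -1)]}

texts = [t for t, _ in REVIEWS]
labels = [y for _, y in REVIEWS]
GRID = {"lam": [0.01, 0.1, 1.0], "epochs": [5, 20]}  # hypothetical grid
best_acc, best_params = grid_search(texts, labels, GRID, k=5)
final_preds = cross_val_predict(texts, labels, k=5, **best_params)
cm = confusion_matrix(labels, final_preds)
print(best_params, best_acc, cm)
```

The vocabulary is rebuilt inside each fold so that held-out reviews never influence the features used for training. On a real dataset, `k=10` would match the evaluation protocol stated in the abstract.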
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.