Optimization of Naïve Bayes Classifier By Implemented Unigram, Bigram, Trigram for Sentiment Analysis of Hotel Review
Abstract
Decision making depends on information that has been properly analyzed. Sentiment analysis is a data processing technique that can supply this kind of analysis. This study classifies hotels based on sentiment analysis of their reviews using the Naïve Bayes Classifier algorithm, which is considered an efficient and simple classification tool. The sentiment analysis process consists of three stages. The first stage is text pre-processing, which consists of case transformation, stopword removal, and stemming. The second stage is the implementation of N-Gram features, namely Unigram, Bigram, and Trigram. An N-Gram feature is a sequence of N consecutive words that serves as input to the next stage. The last stage is the classification of hotel reviews using the Naïve Bayes Classifier. Tests on the OpinRank Hotels Review dataset with the Naïve Bayes Classifier and Unigram, Bigram, and Trigram features show that Unigram gives better test results than Bigram and Trigram, with an average accuracy of 81.30%.
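The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword list, the toy reviews, and the names `preprocess`, `ngrams`, and `NaiveBayes` are assumptions made for illustration, stemming is left out for brevity, and the classifier is a plain multinomial Naïve Bayes with add-one smoothing, which may differ from the variant used in the study.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not the list used in the study).
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "it", "this"}

def preprocess(text):
    """Stage 1: transform case and remove stopwords (stemming omitted in this sketch)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def ngrams(tokens, n):
    """Stage 2: return every run of n consecutive tokens as a space-joined string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

class NaiveBayes:
    """Stage 3: multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, n=1):
        self.n = n                                  # 1 = unigram, 2 = bigram, 3 = trigram
        self.class_docs = Counter()                 # number of training reviews per class
        self.feature_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()                          # all n-grams seen in training

    def features(self, text):
        return ngrams(preprocess(text), self.n)

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_docs[label] += 1
            for f in self.features(text):
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Sum log-probabilities instead of multiplying to avoid underflow.
            score = math.log(n_docs / total_docs)
            for f in self.features(text):
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy reviews, only to exercise the code (not taken from OpinRank).
train_texts = [
    "the room was clean and the staff was friendly",
    "great location and friendly staff",
    "the room was dirty and noisy",
    "rude staff and dirty bathroom",
]
train_labels = ["positive", "positive", "negative", "negative"]

unigram_model = NaiveBayes(n=1).fit(train_texts, train_labels)
bigram_model = NaiveBayes(n=2).fit(train_texts, train_labels)
```

Changing `n` switches between Unigram, Bigram, and Trigram features while the rest of the pipeline stays the same, which is how the three feature sets could be compared on one dataset. Longer n-grams repeat less often across reviews, so their counts become sparser as `n` grows.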
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.