Improved Accuracy of Naive Bayes Classifier for Determination of Customer Churn Uses SMOTE and Genetic Algorithms

Afifah Ratna Safitri
Much Aziz Muslim

Abstract

With competition in the business world increasing, many companies use data mining techniques to determine the level of customer loyalty. The customer data used in this study is the German Credit dataset obtained from UCI. The data has a class imbalance problem because the loyal class contains more records than the churn class. In addition, some attributes are irrelevant for customer classification, so attribute selection is needed to obtain more accurate classification results. One classification algorithm is Naive Bayes, which has been used as an effective classifier for years because it is easy to build and treats attributes as independent of one another. The purpose of this study is to improve the accuracy of Naive Bayes for customer classification by applying SMOTE and a genetic algorithm. SMOTE is used to handle the class imbalance problem, while the genetic algorithm is used for attribute selection. The accuracy of Naive Bayes alone is 47.10%, while the mean accuracy of Naive Bayes with SMOTE is 78.15% and the accuracy of Naive Bayes with both SMOTE and the genetic algorithm is 78.46%.
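
To illustrate the oversampling step described in the abstract, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, the parameters `n_new`, `k` and `seed`, and the toy two-dimensional points are all assumptions made for illustration, and the sketch assumes purely numeric attributes, so categorical attributes would need separate handling.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data, made up for illustration (not the German Credit dataset)
loyal = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3), (0.8, 0.9), (1.3, 1.0)]
churn = [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]

# Oversample the churn class until both classes have the same size
new_churn = smote(churn, n_new=len(loyal) - len(churn), k=2)
balanced_churn = churn + new_churn
```

Because every synthetic point is an interpolation between two existing churn samples, the new points stay inside the region already occupied by the minority class rather than duplicating records exactly.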

Article Details

How to Cite
Afifah Ratna Safitri and Much Aziz Muslim, “Improved Accuracy of Naive Bayes Classifier for Determination of Customer Churn Uses SMOTE and Genetic Algorithms”, J. Soft Comput. Explor., vol. 1, no. 1, pp. 70-75, Oct. 2020.