Improving Algorithm Accuracy K-Nearest Neighbor Using Z-Score Normalization and Particle Swarm Optimization to Predict Customer Churn

Muhammad Ali Imron
Budi Prasetyo

Abstract

Growing business competition has led many companies to use data mining techniques to determine customer loyalty levels. Data mining comprises several families of models, one of which is classification, and one of the most commonly used classification methods is the K-Nearest Neighbor algorithm. This study uses the German Credit dataset obtained from the UCI Machine Learning Repository. Its purpose is to examine how Z-score normalization of the data, combined with Particle Swarm Optimization to search for the optimal value of the parameter K, improves the classification performance of the K-Nearest Neighbor algorithm. The classification was evaluated with a confusion matrix to determine the resulting accuracy. The results show that applying Z-score normalization and Particle Swarm Optimization to the K-Nearest Neighbor algorithm increased accuracy by 14 percentage points, from an initial 68.5% to 82.5%.
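The pipeline described in the abstract (Z-score normalization, then a Particle Swarm Optimization search over K, then K-Nearest Neighbor classification scored by accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are a small synthetic two-class set rather than the German Credit dataset, and the swarm size, iteration count, search range for K, and the coefficients `w`, `c1`, `c2` are all assumed values chosen for illustration. Scoring K directly on a held-out split is likewise a simplification.

```python
import math
import random
from collections import Counter

def zscore_fit(rows):
    """Per-feature mean and population standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant feature has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def zscore_apply(rows, means, stds):
    """Z-score each value: z = (x - mean) / std."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def knn_predict(train_x, train_y, x, k):
    """Majority label among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of correct predictions, i.e. (TP + TN) / total."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)

def pso_best_k(fitness, k_min, k_max, n_particles=6, n_iter=15, seed=0):
    """One-dimensional PSO; positions are rounded to an integer K for scoring."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients

    def to_k(p):
        return int(round(min(max(p, k_min), k_max)))

    pos = [rng.uniform(k_min, k_max) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_fit = [fitness(to_k(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], k_min), k_max)
            f = fitness(to_k(pos[i]))
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i], f
    return to_k(gbest), gbest_fit

# Synthetic data (hypothetical): one informative small-scale feature and one
# large-scale noise feature, so unnormalized distances are dominated by noise.
data_rng = random.Random(42)

def make_rows(n):
    xs, ys = [], []
    for _ in range(n):
        label = data_rng.randint(0, 1)
        xs.append([data_rng.gauss(2.0 * label, 1.0), data_rng.gauss(0.0, 1000.0)])
        ys.append(label)
    return xs, ys

train_x, train_y = make_rows(120)
test_x, test_y = make_rows(60)

# Fit normalization statistics on the training split only, then apply to both.
means, stds = zscore_fit(train_x)
train_z = zscore_apply(train_x, means, stds)
test_z = zscore_apply(test_x, means, stds)

cache = {}
def fitness(k):
    """Held-out accuracy for a given K, memoized because PSO revisits values."""
    if k not in cache:
        cache[k] = accuracy(train_z, train_y, test_z, test_y, k)
    return cache[k]

best_k, best_acc = pso_best_k(fitness, k_min=1, k_max=25)
print("raw features, k=1:", accuracy(train_x, train_y, test_x, test_y, 1))
print("z-scored, PSO-selected k =", best_k, "accuracy =", best_acc)
```

Because the search space here is a single integer, an exhaustive sweep over K would be just as practical. The swarm is kept only to mirror the approach named in the abstract.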

How to Cite
[1]
Muhammad Ali Imron and Budi Prasetyo, “Improving Algorithm Accuracy K-Nearest Neighbor Using Z-Score Normalization and Particle Swarm Optimization to Predict Customer Churn”, J. Soft Comput. Explor., vol. 1, no. 1, pp. 56-62, Oct. 2020.

