The Implementation of Z-Score Normalization and Boosting Techniques to Increase Accuracy of C4.5 Algorithm in Diagnosing Chronic Kidney Disease
Abstract
In the health sector, data mining can support disease prediction from collections of patient medical records or other health data. One applicable technique is classification with the C4.5 algorithm. Its accuracy can be increased by transforming the data with the z-score normalization method. An ensemble method, namely boosting (AdaBoost), can also improve the accuracy of the C4.5 algorithm. The purpose of this study was to determine how z-score normalization is implemented at the pre-processing stage and how AdaBoost is implemented with the C4.5 algorithm, and to determine the accuracy of the C4.5 algorithm after applying z-score normalization and AdaBoost in diagnosing chronic kidney disease. The mining process used k-fold cross-validation with the default value k = 10. The C4.5 algorithm alone obtained an accuracy of 96%, while the C4.5 algorithm with z-score normalization obtained 96.75%. The highest accuracy, 97.25%, was obtained by adding the boosting method to the C4.5 algorithm with z-score normalization. This is an increase of 1.25 percentage points over the accuracy of the plain C4.5 algorithm.
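The two pre-modelling steps named in the abstract, z-score normalization and 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `zscore` and `kfold_indices`, the sample values, and the choice of the population standard deviation are all assumptions made for illustration, and the C4.5 and AdaBoost steps are not shown.

```python
from statistics import mean, pstdev


def zscore(values):
    """Return (x - mean) / std for each value in a numeric column."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical attribute values, not taken from the study's dataset.
ages = [48.0, 62.0, 51.0, 70.0, 39.0]
scaled = zscore(ages)

# Hypothetical dataset size of 20 rows, split with k = 10 as in the abstract.
folds = list(kfold_indices(20, k=10))
```

After this transformation the column has mean 0 and standard deviation 1. A common practice, which the abstract does not confirm or rule out, is to compute the mean and standard deviation on each training fold only and then apply them to the matching test fold.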
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.