Increasing Accuracy of C4.5 Algorithm Using Information Gain Ratio and Adaboost for Classification of Chronic Kidney Disease

Aprilia Lestari
Alamsyah

Abstract

The volume of available data is very large, and processing it takes a very long time. Data mining is therefore used to process large amounts of data. Data mining methods can be used to classify patient diseases, one of which is chronic kidney disease. This research used a decision tree classification method, the C4.5 algorithm. In pre-processing, feature selection was applied to remove attributes that did not improve classification accuracy. The feature selection used the gain ratio. The ensemble method used was AdaBoost, which is well known as a boosting method. The dataset used was the Chronic Kidney Disease (CKD) dataset, obtained from the UCI Machine Learning Repository. The purpose of this research was to apply the information gain ratio and the AdaBoost ensemble to the chronic kidney disease dataset using the C4.5 algorithm, and to determine the resulting accuracy of C4.5. The results were obtained with the default number of AdaBoost iterations, which was 50. The accuracy of stand-alone C4.5 was 96.66%. The accuracy of C4.5 with information gain ratio was 97.5%, while C4.5 with information gain ratio and AdaBoost reached 98.33%.
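
As an illustration of the gain ratio criterion mentioned above, the following is a minimal sketch of how categorical attributes could be ranked before training a classifier. It is not the authors' implementation: the function names (`entropy`, `gain_ratio`, `rank_features`) and the toy attribute names and records are hypothetical, and the sketch handles only categorical attributes with no missing values.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus the weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution;
    # dividing by it penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = {name: gain_ratio(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up records (not taken from the CKD dataset).
labels = ["ckd", "ckd", "notckd", "notckd"]
columns = {
    "attr_a": ["yes", "yes", "no", "no"],  # separates the classes perfectly
    "attr_b": ["x", "y", "x", "y"],        # carries no class information
    "attr_c": ["k", "k", "k", "k"],        # constant, so split information is zero
}
ranking = rank_features(columns, labels)
```

Attributes with a low gain ratio are candidates for removal before the tree is built. Where to place the cut-off is a design choice that the abstract does not specify.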

Article Details

How to Cite
[1]
Aprilia Lestari and Alamsyah, “Increasing Accuracy of C4.5 Algorithm Using Information Gain Ratio and Adaboost for Classification of Chronic Kidney Disease”, J. Soft Comput. Explor., vol. 1, no. 1, pp. 32-38, Oct. 2020.

