HANDLING MISSING VALUE DATA ON CORN PRODUCTION QUALITY USING THE K-NN IMPUTATION METHOD WITH THE C4.5 ALGORITHM

  • Moch. Lutfi Universitas Yudharta Pasuruan
  • Mochamad Hasyim Universitas Yudharta Pasuruan
Keywords: Data Mining, K-NN Imputation, C4.5, Quality of Corn Production, Missing Value

Abstract

Corn is a staple crop in Indonesia, where much of the population depends on agriculture for its livelihood. Raising corn productivity also requires attention to the quality of the corn produced. This research applies data mining to empirical observations so that otherwise neglected data becomes useful. Determining the quality of corn production is therefore an important classification task that can support farmers. Missing values are a common obstacle to maintaining data quality. They can arise from several causes, one of which is error at the time of data entry. Missing values become a serious problem in large datasets, where they can strongly influence the results. This research therefore proposes the K-NN imputation method to handle missing values. The C4.5 algorithm achieved a classification accuracy of 92.90% on the corn production dataset when the missing values received no special handling. When the missing values were first handled with K-NN imputation, the best accuracy was 94.50% at k = 5, an improvement of 1.60 percentage points over the result without special handling.
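The imputation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance measure or how neighbor values are combined, so this example assumes numeric features, a Euclidean-style distance computed over the features that both rows have observed, and the mean of the k nearest neighbors as the filled value. The function name `knn_impute` and the sample rows are hypothetical and are not taken from the corn production dataset.

```python
import math


def knn_impute(rows, k=5):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Assumptions (not from the paper): numeric features on comparable scales,
    distance measured only on features observed in both rows.
    """
    imputed = [list(r) for r in rows]  # copy so the input stays untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Root mean squared difference, so rows sharing different
                # numbers of features remain comparable.
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                nearest = [v for _, v in candidates[:k]]
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical toy data: the third row is missing its last feature.
rows = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [0.9, 1.9, None],
    [8.0, 9.0, 50.0],
    [8.2, 9.1, 54.0],
]
filled = knn_impute(rows, k=2)
print(filled[2])
```

In this toy example the two rows closest to the incomplete row supply the filled value, while the distant rows are ignored. A classifier such as C4.5 would then be trained on the completed table; features with very different ranges would normally be scaled before computing distances.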


Published
2019-10-28
How to Cite
Moch. Lutfi, & Mochamad Hasyim. (2019). PENANGANAN DATA MISSING VALUE PADA KUALITAS PRODUKSI JAGUNG DENGAN MENGGUNAKAN METODE K-NN IMPUTATION PADA ALGORITMA C4.5. Jurnal RESISTOR (Rekayasa Sistem Komputer), 2(2), 89-104. https://doi.org/10.31598/jurnalresistor.v2i2.427