Performance Comparative Study of Machine Learning Classification Algorithms for Food Insecurity Experience by Households in West Java
DOI:
https://doi.org/10.15575/join.v9i1.1012
Keywords:
Extremely randomized tree, Food insecurity, Gradient boosting, Random forest, Rotation forest
Abstract
This study compares the classification performance of four methods (random forest, gradient boosting, rotation forest, and extremely randomized trees) in classifying the food insecurity experience scale in West Java. The dataset is based on the 2020 Socio-Economic Survey conducted by Statistics Indonesia. The novelty of this research lies in comparing the performance of these four methods, all of which are tree ensemble approaches. In addition, because of the class imbalance problem, the authors also applied three imbalance handling techniques. The results show that the combination of the random forest algorithm and random undersampling is the best classifier, with a balanced accuracy of 65.795%. The results of this best classifier show that the food insecurity experience scale in West Java can be identified by considering floor area (house size), the number of depositors, type of floor, health insurance ownership status, and internet access capabilities.
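To illustrate two terms used in the abstract, the following is a minimal sketch of balanced accuracy (the mean of per-class recall) and random undersampling (randomly discarding rows until every class is as small as the rarest one). It is not the authors' code: the function names, the toy labels, and the fixed seed are assumptions made for illustration, and the classifier itself is left out.

```python
import random
from collections import Counter, defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy labels: 8 food-secure (0) and 2 food-insecure (1) households.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a classifier that always predicts the majority class

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)

# Undersample the training labels so both classes have the same size.
rows = list(range(len(y_true)))
_, balanced_labels = random_undersample(rows, y_true)
class_counts = Counter(balanced_labels)

print(plain_accuracy, bal_acc, dict(class_counts))
```

On this toy data the majority-only classifier reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which suggests why balanced accuracy is the more informative metric when the classes are imbalanced.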
License
Copyright (c) 2024 JOIN (Jurnal Online Informatika)
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.