K-Means-Based Pseudo-Labeling Technique in Supervised Learning Models for Regional Classification Based on Types of Non-Communicable Diseases
DOI: https://doi.org/10.15575/join.v10i2.1609
Keywords: K-means, Non-communicable diseases, Pseudo-labeling, Regional classification, Semi-supervised learning
Abstract
License
Copyright (c) 2025 Herison Surbakti, Tb Ai Munandar

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NoDerivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
- You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
- No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
