Comparative Performance of Fine-Tuned IndoBERT BASE and LARGE Variants for Emotion Detection in Indonesian Tweets

Sri Winarno; Ika Novita Dewi; Adhitya Nugraha; Fahri Firdausillah; Maulatus Shaffira Fitri; Talitha Olga Ramadhani; Erna Amalia Widhiyanti; Ainur Rahma Miftakhul Rizqi

doi:10.15575/join.v11i1.1704

Authors

Sri Winarno, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
Ika Novita Dewi, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
Adhitya Nugraha, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
Fahri Firdausillah, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
Maulatus Shaffira Fitri, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
Talitha Olga Ramadhani, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
Erna Amalia Widhiyanti, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
Ainur Rahma Miftakhul Rizqi, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia

Keywords:

Emotion Detection, IndoBERT, Indonesian Text, Optimization, Transformer Model

Abstract

Emotions shape human behavior, communication, and decision-making, and on platforms such as Twitter they are often expressed in short, informal text. This research aims to improve the accuracy of emotion detection in Indonesian text by fine-tuning two transformer models, IndoBERT-BASE-P2 and IndoBERT-LARGE-P2. The dataset consists of 7,080 tweets annotated with six basic emotion categories (anger, fear, joy, love, neutral, and sad). The methodology comprised text preprocessing, class balancing with SMOTE, and fine-tuning with optimized training parameters. In the evaluation, IndoBERT-BASE-P2 achieved an accuracy of 84.43% and a macro F1-score of 84.33%, surpassing previous studies, whereas the larger IndoBERT-LARGE-P2 tended to overfit and offered no meaningful improvement. Error analysis showed that the neutral class was the most difficult to classify. These findings indicate that, with effective preprocessing and parameter optimization, a smaller model can be a highly efficient solution for emotion classification in Indonesian text, especially under resource constraints.
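
The two reported metrics, accuracy and macro F1-score, can be illustrated with a minimal sketch. It assumes single-label predictions over the six categories named in the abstract; the function names and the toy labels are hypothetical and are not taken from the study's code or data.

```python
from collections import Counter

# The six emotion categories listed in the abstract.
LABELS = ["anger", "fear", "joy", "love", "neutral", "sad"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so every class counts equally."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy example (hypothetical labels): two confusions between "neutral" and "joy".
y_true = ["anger", "fear", "joy", "love", "neutral", "sad", "neutral", "joy"]
y_pred = ["anger", "fear", "joy", "love", "joy", "sad", "neutral", "neutral"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 averages the per-class scores without weighting by class size, errors concentrated in one class (such as the neutral class in this toy example) lower the score as much as errors in any other class would.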
