Optimizing IndoBERT for Revised Bloom's Taxonomy Question Classification Using Neural Network Classifier

Background: A major challenge in the Indonesian education system is the continued dominance of exam questions that assess only basic thinking skills, such as remembering and understanding. To effectively nurture students' critical, analytical, and creative thinking skills, the integration of higher-order thinking questions has become increasingly urgent. An effective conceptual framework in this regard is the Revised Bloom's Taxonomy (BT), which classifies cognitive skills into six levels: remember, understand, apply, analyze, evaluate, and create. The framework is particularly valuable because it promotes the development of exam questions that go beyond lower-level thinking skills, fostering deeper understanding among students. In this context, automated systems powered by deep learning (DL) have shown promising accuracy in classifying questions by BT level, offering practical support for educators who aim to design more meaningful and intellectually stimulating assessments.

Objective: This research aims to develop a system that effectively classifies Indonesian exam questions according to BT levels using IndoBERT pretrained models. These models were combined with Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) classifiers (referred to as IndoBERT-CNN and IndoBERT-LSTM) to determine which configuration performs best.
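To make the architectures concrete, below is a minimal sketch of the IndoBERT-LSTM variant in PyTorch with HuggingFace Transformers. The checkpoint name (indobenchmark/indobert-base-p1), the bidirectional LSTM head, and the hidden size are illustrative assumptions; the paper's exact configuration may differ.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class IndoBERTLSTM(nn.Module):
    """IndoBERT encoder with an LSTM head for the six BT levels."""

    def __init__(self, model_name="indobenchmark/indobert-base-p1",
                 hidden_size=256, num_labels=6):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        # Bidirectional LSTM over IndoBERT's token embeddings (assumed head design)
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden_size,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from the pretrained encoder
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Final forward and backward LSTM states summarize the question
        _, (h_n, _) = self.lstm(hidden)
        pooled = torch.cat((h_n[-2], h_n[-1]), dim=1)
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
model = IndoBERTLSTM()
batch = tokenizer(["Jelaskan perbedaan antara fotosintesis dan respirasi."],
                  return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # shape: (1, 6)
```

The IndoBERT-CNN variant would replace the LSTM head with convolutional layers over the same token embeddings, followed by pooling.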

Methods: The dataset was self-collected and underwent several stages of preparation, including expert labeling and splitting. Preprocessing, consisting of case folding, tokenization, stopword removal, and stemming, was then conducted to ensure the dataset was consistent and free from irrelevant features. Hyperparameter fine-tuning was subsequently carried out on IndoBERT, IndoBERT-CNN, and IndoBERT-LSTM. Model performance was evaluated using Accuracy, Precision, Recall, and F-Measure.
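A minimal sketch of the preprocessing steps is given below, assuming the PySastrawi library for Indonesian stopword removal and stemming; the abstract does not name the actual tools used.

```python
from Sastrawi.Stemmer.StemmerFactory import StemmerFactory
from Sastrawi.StopWordRemover.StopWordRemoverFactory import StopWordRemoverFactory

# PySastrawi is an assumption; the paper's tooling may differ.
stemmer = StemmerFactory().create_stemmer()
stopword_remover = StopWordRemoverFactory().create_stop_word_remover()

def preprocess(question: str) -> str:
    text = question.lower()               # case folding
    text = stopword_remover.remove(text)  # Indonesian stopword removal
    text = stemmer.stem(text)             # Indonesian stemming
    tokens = text.split()                 # simple whitespace tokenization
    return " ".join(tokens)

print(preprocess("Sebutkan tiga contoh sumber energi terbarukan!"))
```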

Results: After fine-tuning, IndoBERT-LSTM outperformed IndoBERT-CNN. Its optimal hyperparameter configuration, a batch size of 64 and a learning rate of 5e-5, achieved the highest performance: an Accuracy of 88.75%, Precision of 85%, Recall of 88%, and F-Measure of 86%.
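The reported metrics can be computed as in the sketch below; macro averaging over the six BT levels is an assumption, since the abstract does not state the averaging scheme.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    acc = accuracy_score(y_true, y_pred)
    # Macro averaging weights all six levels equally (an assumption).
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {"accuracy": acc, "precision": prec, "recall": rec, "f_measure": f1}

# Dummy example: integer labels 0..5 = remember..create
print(evaluate([0, 3, 5, 2, 1], [0, 3, 4, 2, 1]))
```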

Conclusion: IndoBERT, IndoBERT-CNN, and IndoBERT-LSTM all showed promising results, although performance was significantly affected by the respective architectures and hyperparameter settings. Among the three models, IndoBERT performed best with smaller batch sizes and moderate learning rates, IndoBERT-CNN achieved stronger results with a higher learning rate and similar batch sizes, and IndoBERT-LSTM recorded the highest accuracy with larger batch sizes, which improved gradient stability. However, IndoBERT is constrained by its focus on the Indonesian language, and the interpretability of its predictions, specifically in relation to the expert-labeled data, remains unclear.

Keywords: Bloom's Taxonomy, CNN, Hyperparameter Fine-Tuning, IndoBERT, LSTM, Question Classification