Multi-task Learning for Named Entity Recognition and Intent Classification in Natural Language Understanding Applications
Background: Natural Language Understanding (NLU) is the area of Natural Language Processing (NLP) research concerned with understanding human language. It is a crucial part of NLP applications such as chatbots, which must interpret the user's intent and identify important entities. NLU systems depend on intent classification and named entity recognition (NER), both of which are crucial for extracting meaningful information from user input. Beyond chatbots, NLU also plays a pivotal role in other applications that require efficient and precise text understanding.
Objective: This study introduces multi-task learning techniques to improve performance on NLU tasks, especially intent classification and NER, in specific domains.
Methods: To achieve language understanding capability, the intent classification and entity recognition tasks are combined in a single shared model that exploits a shared representation and the dependencies between the tasks. This approach, known as multi-task learning, leverages the interaction between these related tasks to enhance performance. The proposed learning architecture is designed to be adaptable to various NLU-based applications, but this work discusses its use in chatbots.
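The idea of one shared representation feeding two task-specific outputs can be illustrated with a minimal sketch. It is not the architecture used in the study: the class name SharedMultiTaskModel, the embedding-lookup "encoder", the mean pooling, and all sizes are assumptions chosen only to keep the example short and dependency-free.

```python
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SharedMultiTaskModel:
    """Hypothetical toy model: one shared encoder, two task heads."""

    def __init__(self, vocab, n_intents, n_tags, dim=8, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        # Shared parameters (assumption: a plain embedding table stands in
        # for the real encoder). The last row is used for unknown tokens.
        self.embed = make_matrix(len(vocab) + 1, dim, rng)
        # Task-specific parameters.
        self.intent_head = make_matrix(n_intents, dim, rng)
        self.tag_head = make_matrix(n_tags, dim, rng)

    def encode(self, tokens):
        """Shared representation: one vector per token."""
        unknown = len(self.vocab)
        return [self.embed[self.vocab.get(t, unknown)] for t in tokens]

    def forward(self, tokens):
        """Return (intent logits for the utterance, tag logits per token)."""
        states = self.encode(tokens)
        dim = len(states[0])
        # Intent head sees a mean-pooled summary of the shared states.
        pooled = [sum(s[d] for s in states) / len(states) for d in range(dim)]
        intent_logits = matvec(self.intent_head, pooled)
        # Entity head sees each shared token state individually.
        tag_logits = [matvec(self.tag_head, s) for s in states]
        return intent_logits, tag_logits


# Usage with made-up vocabulary and label counts.
model = SharedMultiTaskModel(
    ["book", "a", "flight", "to", "jakarta"], n_intents=3, n_tags=5
)
intent_logits, tag_logits = model.forward("book a flight to jakarta".split())
```

Because both heads read the same encoder output, a training signal from either task updates the shared parameters, which is the mechanism multi-task learning relies on.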
Results: Experiments on both intent classification and named entity recognition show the effectiveness of the proposed approach and highlight the potential of multi-task learning in closed-domain chatbot systems. The optimal hyperparameters consist of 60 warm-up steps, an early stopping probability of 10, a weight decay of 0.001, a named entity recognition (NER) loss weight of 0.58, and an intent classification loss weight of 0.4.
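As an illustration of how two loss weights and a warm-up setting of this kind are commonly used, the sketch below combines the task losses as a weighted sum and scales the learning rate linearly over the warm-up steps. The weighted-sum form, the per-token cross-entropy for NER, the linear warm-up shape, and the base_lr parameter are all assumptions; the source reports only the values 0.58, 0.4, and 60.

```python
import math

# Values reported in the results; how they are applied is assumed here.
NER_LOSS_WEIGHT = 0.58
INTENT_LOSS_WEIGHT = 0.4
WARMUP_STEPS = 60


def cross_entropy(logits, target):
    """Negative log-likelihood of the target class under a softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def joint_loss(intent_logits, intent_target, tag_logits, tag_targets):
    """Weighted sum of the intent loss and the mean per-token NER loss."""
    intent_loss = cross_entropy(intent_logits, intent_target)
    ner_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(tag_logits, tag_targets)
    ) / len(tag_targets)
    return NER_LOSS_WEIGHT * ner_loss + INTENT_LOSS_WEIGHT * intent_loss


def warmup_lr(step, base_lr):
    """Linear warm-up: ramp from base_lr / WARMUP_STEPS up to base_lr."""
    return base_lr * min(1.0, (step + 1) / WARMUP_STEPS)


# Usage with made-up logits: two intents, two tokens with two tags each.
loss = joint_loss([0.0, 0.0], 0, [[0.0, 0.0], [0.0, 0.0]], [0, 1])
```

With these weights the NER term contributes slightly more to the gradient than the intent term, which would bias the shared parameters toward the entity task under the assumed weighted-sum formulation.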
Conclusion: The performance of the Dual Intent and Entity Transformer (DIET) on both tasks, intent classification and named entity recognition, is highly dependent on the data. As a result, the effectiveness of a given hyperparameter combination varies from dataset to dataset. The proposed model architecture significantly outperforms previous studies on common evaluation metrics.
Keywords: Natural Language Understanding, Chatbot, Multi-task Learning, Named Entity Recognition
Copyright (c) 2025 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
All accepted papers are published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. Authors retain copyright and grant the journal the right of first publication. The CC BY license lets others Share (copy and redistribute the material in any medium or format) and Adapt (remix, transform, and build upon the material for any purpose, even commercially).